SQL Server 2019 Standard Edition Feature Limitations Are Out

The features list by edition is out, and there’s a disclaimer at the top:

This content is being updated for SQL Server 2019. The information in the article is not final.

Which means if you’re really angry, and your company buys a lot of licenses, now’s your time to raise holy hell with your Microsoft reps if there’s something in this doc that disappoints you.

The most important stuff:

  • Standard Edition is still capped at 128GB RAM.
  • Accelerated Database Recovery is in Standard Edition.
  • Automatic tuning, batch mode for row store, adaptive memory grants, adaptive joins, and memory-optimized TempDB are Enterprise Edition only.
  • Automatic inlining of scalar functions isn’t listed, but given the above, it’s probably Enterprise Edition only as well. Update: scalar UDF inlining is now listed, and it’s in Standard Edition! Woohoo!

Update: how I feel about the limitations

Looking back at this, I realize that I didn’t editorialize this at all – I didn’t say how I felt about it. I’m actually totally cool with it except for two things, and I’ll get to those in a second.

Microsoft’s gotta make money just like you do, and they need big differentiators between Standard Edition and Enterprise Edition. They need to have serious reasons as to why you’d spend the extra $5K per CPU core. I think automatic tuning, batch mode for row store tables, adaptive joins, and memory-optimized TempDB are all fair limitations. If you’re hitting the kinds of performance issues where you need those features in order to survive – not just that you want them, but that you have to have them – then you should probably pony up for Enterprise Edition.

The vast, vast majority of shops have survived just fine without those features for decades. Sure, you’ve hit performance limitations that have caused you to do some performance tuning, and those took you some time. Maybe even a lot of time. But that’s the value proposition of Enterprise: when you hit big performance issues, you could EITHER tune it manually yourself, or you could pay Microsoft to improve the performance for you. I think it’s a fair trade.

First, Basic Availability Groups are a joke. I know, they’re supposed to replace database mirroring, but the requirement of a separate Availability Group for every database, and a separate listener, and a separate IP address – those are just ridiculous. Let people put multiple databases in the same Basic Availability Group and fail them over together. (I’m not asking for multiple replicas or read replicas – I get that both of those are great differentiators for Enterprise. I know, some readers are gonna say they need multiple replicas for both high availability and disaster recovery, and I don’t think that’s unreasonable, but I’m not fighting that battle today.)

Second, there’s one area that’s unforgivable: memory grants. SQL Server itself makes horrible, horrible decisions around memory grants, especially around over-estimation. Standard Edition has the perfect storm of problems:

  1. It’s capped at 128GB RAM (especially around query workspace)
  2. You can’t use Resource Governor to cap query grants
  3. SQL Server can’t learn from its mistakes because Standard doesn’t get adaptive grants

I would totally understand if we had ANY way at the system level to fix SQL Server’s bad memory grant decisions, but the only option we have is by changing the queries themselves (by rewrites or by adding grant hints.) Microsoft should give us SOME way to fix these bad grants at the system level – and putting any one of the 3 above options in Standard would be fine.

Or, you know, they could fix the query optimizer to stop granting so many queries 25% of the buffer pool every time they run a query. But let’s be realistic here.

How to Think Like the SQL Server Engine: Adding a Nonclustered Index

When we left off in the last post, our users kept running this query, and they want it to be really fast:

Let’s pre-bake the data by creating a copy of the table sorted in a way that we can find the right rows faster:

This builds a separate copy of our table (also stored in 8KB pages) that looks like this:

This index slows down inserts and deletes.

The first thing to notice is that you now have two physical copies of the table. Every time you insert a new row, you have to add it in two places. Every time you delete a row, you have to delete it in two places. You’ve effectively doubled the amount of writes that your storage has to do. (Updates are a little trickier – more on that in another post.)

The second thing to notice is that the nonclustered index (black page) is denser: you can fit more users per 8KB page on that because we chose to store less fields. You can see that by running sp_BlitzIndex and looking at the top result set, which shows how big each index is:

The clustered index (marked CX PK) has ~300K rows and takes up 58.1MB space on disk because it contains all the columns of the table.

The nonclustered index has the same number of rows, but it only takes up 5.3MB space because we’re only storing LastAccessDate and Id on it. The more columns you add to an index – whether they’re in the key, or in the includes – they make the object larger on disk. (I’ll talk more about index design as this series continues.)

But this index pays off dramatically for selects.

Try our select query again – and here, I’m running two queries back to back. I’m running the first query with INDEX = 1 as a hint to show the cost of the clustered index scan. (Index #1 is your clustered index.)

Our new bottom plan is much simpler:

  • We get a seek, not a scan – because SQL Server is able to jump right to the rows where LastAccessDate > 2014/07/01. It doesn’t need to scan the whole object to find ’em.
  • We don’t need a sort – because as we read the data out, it’s already sorted by LastAccessDate.

The index seek’s cost is lower for two reasons: it doesn’t need the sort, and it reads less 8KB pages. To see how much less, let’s check out the messages tab:

The clustered index scan read 7,405 pages and took 110ms of CPU time to do the ORDER BY.

The new nonclustered index only read 335 pages – like 20X less! – and didn’t need CPU time for ORDER BY. Really good index designs like this are how you can quickly get 20X (or much higher) improvements in your queries without rewriting the queries. Do I wish every query was perfectly tuned? Sure – but even if it was, you’re still gonna need to give the engine a helping hand by organizing the data in more searchable ways.

This index is called a covering index.

The index perfectly covers this query, satisfying everything it needs in a quick, efficient way. The term “covering” isn’t a special kind of index that you create with different syntax – “covering” just refers to the combination of this index AND this query. If you change the query at all – like we’re going to do here shortly – then the index may no longer be covering, and we’ll have to do more work again.

How to Think Like the SQL Server Engine: Running a Query Repeatedly

Earlier in this series, we ran a query with ORDER BY, and we realized that it was CPU-intensive work that tripled the cost of the query:

Now, let’s run that query a bunch of times. In SSMS, you can add numbers after the GO, and SSMS will run your query a number of times. I’ll run it 50 times:

I explained SET STATISTICS IO ON in the first post, but I’ve added a new option here: TIME. This adds more messages that show how much CPU time and elapsed time the query burned:

SQL Server executes the query over and over, reading the 7,405 pages each time, and doing the sort work each time.

When I was a developer, I used to be totally okay with constantly fetching data from the SQL Server. After all, I’d just run the query before, right? The data’s in cache, right? Surely SQL Server caches that sorted data so that it doesn’t have to redo all that work – especially when my queries had been doing a lot of joining, grouping, filtering, and sorting.

SQL Server caches raw data pages,
not query output.

It doesn’t matter if the data hasn’t changed. It doesn’t matter if you’re the only one in the database. It doesn’t even matter if the database is set to read-only.

SQL Server re-executes the query again from scratch.

And similarly, if 500 people are running the exact same query at the same exact time, SQL Server doesn’t execute it once and share the results across all of the sessions. Each query gets its own memory grant and does its own CPU work.

This is one of those areas where Oracle has us beat. Sometimes folks will ask me what my favorite database is, and I gotta confess that if money didn’t matter, I’d probably be really interested in Oracle. Check out their Result Cache feature: you can configure a percentage of memory to cache query results to be ready-to-go when apps keep rerunning the same query. However, at $47,500 per CPU core for Enterprise Edition licensing (check out the price sheets), I’m afraid I’m not going to be sampling any of that caviar anytime soon.

One way we can solve that problem is by caching the results in the application layer:

But another way – and the way I use most often – is to pre-bake the data in a way that makes queries run faster. That’s a nonclustered index, and I’ll cover those next.

How to Think Like the SQL Server Engine: The Perils of SELECT *

In our last post, we ran a query with an ORDER BY, but we only got one column in the SELECT:

The estimated cost was about $18 Query Bucks because SQL Server had to:

  • Scan the entire clustered index, yelling out the Id and LastAccessDate of each row
  • Sort that list by LastAccessDate

Now, let’s change just one thing about the query – what we’re selecting:

And run both of the queries back to back to get their actual execution plans:

The basic execution plan of both queries is pretty similar, aside from the fact that the bottom one went parallel. SQL Server realized that sorting all this data was going to be a heck of a lot more work, so it split the work across multiple CPU cores.

This query sucks in a few different ways.

SELECT * can read more data. I know what you’re thinking: both queries have to read all of the 8KB pages in the table, right? No – go back to the first post in the series when I introduced the Users table. I mentioned that the AboutMe field, a big ol’ NVARCHAR(MAX), might be so large that it ends up getting pushed off-row: stored in different 8KB pages. Our SELECT Id query didn’t need to read those extra pages – but when we SELECT *, we do.

SELECT * can sort more data. SQL Server can’t just sort the LastAccessDates – it sorts the whole rows around. That means it’s going to need more CPU time, more memory, and do more spills to disk if it gets those memory estimates wrong.

SELECT * can take more time to output. This was always the thing I focused on as a database administrator. I would tell my devs, “Don’t select fields you don’t need, because it takes longer to yell that data out over the network.” Today, that’s the least of my problems: I’m much, much more concerned about the CPU time and memory grants consumed by the sort operator.

Your first clue about just how bad the SELECT *’s sort operator sucks is the 97% cost of the sort – but it’s not 97% of the same query cost. The original query cost was $18 Query Bucks, but check out the cost of the SELECT * – IT’S ALMOST NINE HUNDRED QUERY BUCKS.

If query bucks were real dollars, I could have bought my first car with the cost of this SELECT *.

I hate SELECT *.

It isn’t about the star itself – it’s about lazily getting all of the columns, including ones you don’t need, and then forcing SQL Server to sort them.

And just because you didn’t put an ORDER BY in your query doesn’t mean you don’t have a sort, either – Erik wrote about sorts that get injected into your plan even if you didn’t ask for ’em.

For the next post in the series, we’ll go back to selecting just the Id. It’s not that I’m only going to allow my developers to just select Ids and nothing else – I understand that we gotta get data out of the database. However, when I’m dealing with query tuning, and I see a SELECT * (or a big ginormous list of columns), I’m going to start the tuning process by asking if we really need all of those fields. Even with Entity Framework, you can pick the columns you want. Does it take a little more work on your part? Sure – but you’re a lot cheaper than SQL Server’s CPU licensing or Azure SQL DB’s pricing, especially as your app starts to scale.

[Video] What’s New in SQL Server 2019

We must be getting pretty close to the official release of Microsoft SQL Server 2019. There are less than 80 days left in the year. (Interestingly, it’s been almost 60 days since Release Candidate 1 shipped – that’s unusually long for a delay between builds. Hmm.)

With the clock ticking down, now’s a good time to take a 40-minute look at some of my favorite new features: deferred compilation for table variables, adaptive memory grants, adaptive joins, and air_quote_actual plans:

To learn more about this stuff:

How to Think Like the SQL Server Engine: Adding an ORDER BY

We started out the How to Think Like the Engine series with a simple query with a WHERE clause:

Now let’s add an ORDER BY:

Here’s the updated plan – note that the query cost has tripled to $17.72 Query Bucks. Let’s dig into why:

We read the plan from right to left, hovering our mouse over each operator. Each operator is kinda like a standalone program that has its own dedicated work to do, and produces a specific output.

The first thing that happens is the Clustered Index Scan.

At the top right, the clustered index scan is exactly the same as it was in the last query. Hover your mouse over that and there are a lot of juicy details that we didn’t really dig into before:

  • Predicate: the clustered index scan mini-program is only going to return rows where the LastAccessDate > ‘2014-07-01’. (Pretty nifty how it changed the date format and added the time, right?)
  • Estimated Number of Rows: 149,259
  • Estimated Number of Rows to be Read: 299,398 (we have to scan the whole table to find the people who match)
  • Output List: Id, LastAccessDate (because upstream of this mini-program, other mini-programs are going to need both Id and LastAccessDate. Specifically, the Sort operator – which happens next – is going to need to sort all these rows by LastAccessDate.)

The data flows out of this scan operator, and flows into the next one: Sort.

The second thing that happens is the Sort.

The sort’s input is the 148,328 rows of Id & LastAccessDate that came out of the Clustered Index Scan, and they’re not sorted in any kind of order – but our query asked for them to be ordered by LastAccessDate. That’s where the Sort’s work comes in. Hover your mouse over it to see what’s happening in that mini-program:

A few points of interest:

  • Order By (down at the bottom): the main goal of the sort
  • Estimated Number of Rows: 149,259
  • Estimated I/O Cost: $0.01 (because there’s not much I/O to do if you’re sorting 150K rows)
  • Estimated CPU Cost: $11.7752 Query Bucks (because sorting is mostly CPU work)

But note that those estimates above are all based on 149,259 rows. If way more rows came in (or less), then our actual work would have been way more (or less.) This is a good time to stop and mention that you don’t see Actual Cost numbers here in Query Bucks: SQL Server doesn’t go back and re-cost stuff after the work has been completed. Anytime you see a cost – even on an actual plan – it’s just an estimate that was done before the query started. Just like you, SQL Server doesn’t later confess just how over budget or late its work was.

So why did the query cost triple from $6 to $18?

It all comes back to the $11.7752 Query Buck cost of the Sort operator, which is brand new in this plan. Sorting data is hard work.

I jokingly say that SQL Server is the world’s second most expensive place to sort data – second only to Oracle. Next time someone complains about the $7K per core cost of SQL Server Enterprise Edition, remind them that Oracle is $47K per core. $47K. Hoo boy.

If you need to sort data, try doing it in the app tier instead. Developers are really good about scaling out application work, and their app servers don’t cost $7K per core. I’m all about going to bat for my developers to get them more & faster web servers because those are easier/cheaper than scaling out sorting in SQL Server. If the query doesn’t have a TOP, then it probably shouldn’t have an ORDER BY.

What’s that, you say? You’ve never heard Microsoft dispense that advice?

Microsoft, the company that charges $7K per core to do your sorting for you? They haven’t told you about that? Huh. Must have been too busy about telling you how you can now do R, Python, and Java inside the SQL Server, all at the low low price of $7K per core. Funny how that works.

And the more columns you add, the worse it gets. Next up, let’s try SELECT *.

Why Query Plans Can Look Different on Different Servers

In the first post in my How to Think Like the Engine series, I started by doing a pretty simple query:

But I noted that if you were following along on your computer, you might see a different execution plan. If I tweak just one thing about my SQL Server, I see a different plan:

Fully optimized plan

That’s the SAME query, hitting the SAME table with the SAME data, on the SAME server, with the exact same data – but suddenly now the query plan goes parallel (as indicated by the Parallelism icon), and I get a missing index request. What gives?

In this case, I changed Cost Threshold for Parallelism.

CTFP is a server-level setting that says, “If a query costs more than this, then consider parallelizing the query’s work across multiple cores.” Most SQL Server setup checklists will suggest that you raise it from the default of 5 up to something more like 40, 50, or 75. I explain the setting in more details over in my post, What is the CXPACKET Wait Type?

In the first screenshot – where the query goes single-threaded – I had my CTFP set at 50, which means that SQL Server wouldn’t consider parallelizing it if it had a lower cost. If I hover my mouse over the first query’s SELECT operator in the execution plan, I can see the Estimated Subtree Cost:

Estimated Subtree Cost

Estimated Subtree Cost is SQL Server’s guess of how much CPU and IO work will be required to execute the query. SQL Server doesn’t know how fast your CPUs are or how fast your storage is, and these numbers are hard-coded to kinda old hardware. We needed a way to communicate what these numbers mean, though, so Kendra Little came up with the term “Query Bucks.”

This query costs $5.79 Query Bucks, which is less than our $50 Cost Threshold For Parallelism, and it was a pretty simple query to optimize, so SQL Server bailed out early on building a query plan. If you right-click on the SELECT operator and click Properties, you can see that in the Optimization Level property being TRIVIAL.

SQL Server is in a race against the clock: the faster it can ship your query results, the happier you’ll be. There was no sense putting more time into building this query plan since it was seen as trivially simple – any additional time spent compiling the plan would be just wasted.

When I dropped Cost Threshold for Parallelism back down to the default of 5:

  • The query cost of $5.79 was suddenly higher than CTFP ($5)
  • SQL Server went deeper into query optimization (the Optimization Level switched to “FULL”)
  • Parallelism was injected into the plan
  • The missing index recommendation code kicked into action

And now, SQL Server decides to build a better query plan for my configuration settings – that part is really important to understand:

The cost is now HIGHER – $5.87 Query Bucks – but SQL Server believes it’ll complete more quickly by spreading the work across multiple threads. It might be wrong – but it would be wrong based on YOUR input, which is what drove the Cost Threshold for Parallelism setting. By itself, SQL Server just doesn’t know your workload well enough to decide which queries should go parallel and which ones should stay multi-threaded in order to balance load as efficiently as possible across multiple cores. It needs your tender loving hand on the CTFP dial.

Change anything about the server, and your plans can change.

Oh sure, there’s the obvious stuff:

  • Server-level configuration settings (like sp_configure options)
  • Hardware capacity (core count and memory availability change plans)
  • SQL Server major version (2019, 2017, 2016)
  • SQL Server patch level (yes, even Cumulative Updates change optimizer behavior)
  • Data quantity – how many rows you have in tables
  • Data distribution between different values

But that’s really only scratching the surface. SQL Server is incredibly complex, and has an amazing number of ways that you can influence query plan behavior without even meaning to. For more examples, check out 5 Ways to Change Execution Plans Without Tuning.

This has implications for how you do performance tuning.

Go easy on untested changes. Just because you believe that changing a setting is safe doesn’t mean it won’t have unexpected performance implications.

Try to keep production & development servers as identical as practically possible. The closer your servers match, the closer your query plans will match. If production is SQL Server 2017 with 48 cores and 2TB RAM, but development is SQL Server 2019 with 4 cores and 16GB RAM and a tiny subset of production data, you don’t stand a snowball’s chance in hell of getting execution plans that are even remotely similar. You’re going to have a much harder – not impossible, just harder – chance of getting identical query plans. But at the same time…

Most of the time, you don’t need exactly identical query plans. Generally, a crappy query is a crappy query regardless of the SQL Server version or settings. In this blog post series, for example, if you’re following along with my queries, you don’t need to get exactly the same query plan in order to get the point. Just focus on the basics: is the query getting better or worse?

I kinda feel like this is going into more technical detail than really necessary to introduce you to internals in the How to Think Like the Engine series, but I needed to get that out of the way for those of you who wanna follow along with the query plans on your own machines.

In the next post, we’ll get back to expanding our query with more work and seeing how the plans change.

How to Think Like the SQL Server Engine

You’re a developer or a DBA, and you’re comfortable writing queries to get the data you need. You’re much less comfortable trying to design the right indexes for your database server.

In the next several posts, you’ll learn:

  • The differences between clustered and nonclustered indexes
  • How (and when) to make a covering index
  • The basics of execution plans

If you prefer watching videos rather than reading, you can watch my How to Think Like the Engine training videos free too. I just decided to write these out as a series of blog posts because I get so many questions about how this stuff works – from folks who don’t like sitting around watching videos. Let’s get started!

I’m using the Users table in the Stack Overflow database.

StackOverflow.com open sources your question & answer data, and I import that into a few free SQL Server databases you can download. In this series, I’m going to use the small 10GB StackOverflow2010 database (1GB zip file) – if you want to follow along and get similar results & query plans, use that one.

The Users table holds exactly what you think it holds: a list of everybody who’s created an account at StackOverflow.com. The schema is pretty simple:

  • Id – an identity field, starts at 1 and goes up to a bajillion
  • DisplayName – not unique, just the name you go by, like “Brent Ozar” or “Alex”
  • LastAccessDate – the last time you opened a page at StackOverflow.com

In SQL Server, most objects are saved in 8KB pages, and each 8KB page is dedicated entirely to one object. (We’ll save the nuances of columnstore indexes, Hekaton, and other niche objects for other blog posts.) You can think of these 8KB pages just like printed-out pages of an Excel spreadsheet, like this:

dbo.Users clustered index

When I teach this class in person, I actually give out printed-out pieces of paper with this stuff. You can download that 5-page PDF and print it out to help follow along and visualize the data as we go through demos. Just stick with the white piece of paper at first.

That white grid above is the clustered index of the dbo.Users table. In the Stack Overflow databases that I publish, the Id column is set up as the clustered primary key of the table.

This clustered index on Id *is* the table.

Look down the left side of the sheet: the rows are sorted by Id, and the sheet has all of the columns in the table. The clustered index IS the table, and these 8KB pages are the same whether they’re on disk or in memory.

Each database’s data files (MDF/NDF) are just a string of 8KB pages from start to finish. Some of the pages are used by SQL Server to save metadata about the database, but most of ’em are your indexes.

You can see how many pages a table contains by running a query against it like this:

The first line there, SET STATISTICS IO ON, adds info in SQL Server Management Studio’s “Messages” tab that tell you the number of 8KB pages SQL Server had to read to execute your query:

The “logical reads 7405” means SQL Server read 7,405 8KB pages. Generally speaking, the less pages SQL Server has to read to execute your query, the faster your query will go.

7,405 pages is about 15 reams of paper.

You know those 500-page packs of paper that you put into the copier or the printer? (No? Do you remember copiers and printers? Honestly, me neither.) The Users table is one of the smallest tables in the Stack Overflow database export, but it’s still 15 of those packs.

As we work through demos in the upcoming posts, I want you to visualize a stack of 15 reams of paper over in the corner of your room. When I ask you to query the table, I want you to think about how you’d execute that as a human being facing data spread across 15 reams of paper. It’d be a hell of a lot of work, and you wouldn’t be so eager to go grab the first piece of paper to start work. You’d wanna build a really good plan before you go tackle that stack of paper.

That’s why SQL Server tries to build good execution plans.

In SSMS, click Query, Include Actual Plan (or hit Control-M), run the query again, and you’ll get an “Execution plan” tab:

Read the plan from right to left: SQL Server scanned the clustered index (aka the white pages, the table), pushing 299,398 things out to the SELECT statement. It had to scan the whole table because we asked for the whole thing – well, we only asked for one column (Id), but the Ids were scattered across all the pages, so we had to read ’em all.

You wouldn’t normally write a query without a WHERE clause though, so let’s add one.

Let’s be good and add a WHERE clause.

Let’s only get the people who accessed the system after July 1, 2014:

And my execution plan looks the same:

(If you’re following along on your own computer, your query might look a little different – I’ll explain why in the next post.)

Both of my queries – with and without a where clause – read the same number of pages. Here, I’m running both queries back to back, showing their stats in a single Messages tab:

And if you look at the 8KB page again, it kinda makes sense why we have to scan the clustered index in order to find rows with a LastAccessDate > 2014/07/01:

dbo.Users clustered index

Even though our query returns less rows, we still have to do the same amount of work because the data simply isn’t sorted in an order that helps our query. If we’re going to frequently query by LastAccessDate, and if we want that query to run faster, we’re going to need to help SQL Server out by designing an additional copy of our table, pre-sorted in a better way: a non-clustered index.

Disclaimer: I’m simplifying a lot of stuff here.

In this blog post series, I’m going to try to cover a lot of the most important ground, fast. To do that, I’m going to leave out a lot of details that you would probably find interesting – but look, this is a blog post, not a book. It’s a starting point for your learning journey.

However, there are a few disclaimers I need to make or else the armchair bloggers are going to complain:

  • Note that the AboutMe column is clipped off. The AboutMe column is an NVARCHAR(MAX) because Stack Overflow lets you store a huge amount of stuff inside your profile. In SQL Server, large AboutMe data can be moved off-row: saved on separate 8KB pages. When you design tables with big columns like VARCHAR(MAX), NVARCHAR(MAX), XML, JSON, etc., you can end up paying a performance price as SQL Server jumps around to read all that data. Ideally, design columns just wide enough to store the data you need, and no more.
  • The 8KB pages don’t use Excel-style grids. SQL Server needs to cram as much data in per page as it can, so no, it doesn’t organize rows and columns on the 8KB page like a spreadsheet. Similarly, the column headers aren’t stored in plain text on each page, nor are nulls aren’t stored with all capital NULL, hahaha. I’m just using a spreadsheet as a visualization tool because it helps you understand how things work.
  • The 8KB page I’m showing is called a leaf page. Indexes have more kinds of pages too, like B-tree structures that help the engine rapidly find the right page when it’s seeking for a specific row. However, even if you just focus on the leaf pages and don’t learn anything else, you can still do a phenomenal job of improving your database performance – and that’s the goal of this series, getting you up to speed on the most important stuff quickly.

And there’s one more complex thing I need to tackle, but it needs its own blog post, which is the next one in the series: why your query plan might not have exactly matched mine.

I’m coming to Gothenburg & Oslo next summer.

The Nordic countries are some of the most beautiful places I’ve ever seen. Erika and I are scheduling a few trips over there in 2020, and the first one I can announce is SQL Saturday Gothenburg and SQL Saturday Oslo.

I’m teaching my one-day workshop before each event: Performance Tuning in 21 Demos. You’re a developer or DBA who wants to tune indexes and queries to make your application go faster. You don’t like learning by watching slides or reading white papers – you want to jump right into the code and see what happens. You’re the kind of person who loves watching YouTube videos that start with, “Hold my beer and watch this.” Attendees get a year of my Consultant Toolkit.

These pre-cons are an awesome deal for one day of in-person SQL Server training, and they usually sell out. I’m announcing these a year in advance so that those of you who want to travel can schedule it in advance:

Erika and I are making a big vacation out of it, flying in a week early to take a Hurtigruten cruise around Norway, then taking the Bergensbanen train over to Oslo, reputed to be one of the most beautiful train rides in Europe.

You can even watch the entire seven hour journey courtesy of the slow TV movement, something I’ve found really wonderful lately:

See you in Scandinavia!

New Free Azure and SQL Server 2019 Training from Microsoft

Folks often email me and ask, “Brent, why don’t you do training classes on Azure?” The answer is simple: the products change faster than I can recoup the cost of building the training material. By the time the class gets enough revenue to pay off the downtime for building an Azure (or AWS or Google) training class, the material’s out of date. I’m sure you’ve sat through an online video where you’ve noticed things that are already no longer true.

But there’s a company who CAN afford to constantly build new stuff without charging you for it: Microsoft!

The sqlworkshops Github repo is chock full of courses built by Microsoft names that you’ll probably recognize like Bob Ward and Buck Woody. It’s all open source, and written in a way that you can follow along. Now, granted, that’s not necessarily as good as an instructor-led class, because you’ve gotta teach yourself, but … you don’t have to pay for an instructor, so there’s that.

Their latest course is a one-day workshop called SQL Server 2019 Ground to Cloud, and it covers:

  • What’s new in SQL Server 2019
  • Working with Big Data Clusters
  • SQL Server in Azure – the differences between Azure SQL DB, elastic pools, VMs, etc.
  • Migrating data from SQL Server into Azure
  • Which databases to use for different business requirements

You can download the 219-page workshop PDF here, get the PowerPoint slides, and browse the Github repo with the class resources. Don’t be fooled by the times listed to complete each module: if you’re doing these classes yourself, they’re going to take you more time, and that’s not a bad thing. Lots of the slides in these presentations are going to prompt you to jump in a different direction and read up on related features & documentation that you want for your own business projects. That’s a good thing: it means this “one-day workshop” is really something you can dive into for several days, or a few weekends.

If you’re a consultant, your clients are probably going to ask you questions that are covered in these workshops. Microsoft *wants* you to be able to present these workshops to your clients to get ’em the answers they want. These workshops aren’t just about empowering your self-paced learning: they’re also about empowering you as a consultant and/or trainer to get the knowledge out there.

And if you like this, you’ll probably like the rest of the sqlworkshops Github repo. It’s all free – you can’t beat that. (I certainly can’t!)

Updated First Responder Kit and Consultant Toolkit for September 2019

This month’s release has a lot of new feature goodness: the Consultant Toolkit can automatically upload the results to Amazon S3, plus sp_BlitzCache adds a new sort order to better catch unparameterized queries burning up CPU, a new warning for selects that are doing writes (other than >500MB spills, which we’ve been warning you about for years), and support for SQL Server 2019’s air_quote_actual plans.

To get the new version:

Consultant Toolkit Changes

Updated the First Responder Kit with this month’s script updates, plus:

  • Improvement: the –upload command line switch now uploads the data directly to a bucket in your Amazon S3 account. This helps you set up a scheduled task at the client’s office to send you the data weekly or daily – enabling you to keep tabs on their SQL Server’s health without having to constantly VPN in. See the S3 uploading documentation for more details.
  • Improvement: added Plans by Query Hash tab using sp_BlitzCache’s new @SortOrder = ‘query hash’ option. This is basically the same as @SortOrder = ‘cpu’, but it filters for queries with multiple plans in the cache with the same query hash.
  • Fix: on the Databases tab, on the Total line, the “# of Data Files” and “LOB Reserved GB” columns were swapped. This only affected the Total line though, not the individual databases.
  • Fix: on the Databases tab, data sizes are now populated for databases with NVARCHAR names.
  • Fix: formatting improvements.

sp_Blitz Changes

  • Improvement: added Uninstall.sql for those of you who wanna nuke it from orbit. (#2080, thanks ShawnPOWER for the request and Emanuele Meazzo for the code.)

sp_BlitzCache Changes

sp_BlitzFirst Changes

  • Fix: @CheckProcedureCache was being skipped for some time zones, especially British Standard Time users. I had to fix this before stepping foot on the Isle of Man again, naturally. (#2096, thanks PlanetMatt for the delightfully easy-to-reproduce bug report.)

sp_BlitzIndex Changes

  • Improvement: faster compression analysis on partitioned databases. (#2065, thanks Erik Darling.)

sp_DatabaseRestore Changes

  • Fix: last month’s release broke on case-sensitive systems, fixed. (#2097, thanks sm8680 for the bug report.)

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. If you need a free invite, hit SQLslack.com. Be patient – it’s staffed with volunteers who have day jobs.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes us!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.

DBA Training Plan 23: Congratulations! Now, It’s Your Turn.

I bet you think you suck at databases.

You’re sitting there on your computer, reading all kinds of blogs, watching presentations online, and you’re thinking, “Everybody knows more about databases than I do.”

We all feel that way. There’s even a name for it: Impostor Syndrome.

I have it too. For example, about a decade ago, when I headed to Redmond to attend Microsoft’s $20,000 Certified Master program, I kept thinking, “Everybody else there has gotta be different than me. I’m not really supposed to be there, for sure, but at least I’ll be able to spend a few weeks rubbing elbows with people who really are good with databases.” The instructors told us that the program was designed so that when Microsoft had a really ugly customer support situation, they could send out a Microsoft Certified Master, and know for sure that the problem was gonna get handled, no matter how severe the problem.

When I passed all 3 exams and the lab, my opinion of myself still didn’t change. I still thought I sucked at databases.

Today, in the year 2019, I still think the same way: I don’t know enough. It’s just that now, I can say that I’ve made a lot more mistakes, and I just like to help other people avoid the same mistakes that I’ve made.

You’re just like me.
Except you need to start sharing.

If you’ve been reading my blog posts, even just the 20-some posts in my DBA Training Plan, you already know a lot more than most database administrators. Seriously.

What, you think other people with your job aren’t Googling their way through their daily work? Go read the questions at DBA.StackExchange.com or StackOverflow.com. We’re all winging this together. We’re all facing an overwhelming number of databases, applications, error messages, and email alerts. We’re all trying to find help in any way that we can get it.

You can offer some of that help.

There are so many ways you can give back to the community: blog, present, answer questions, record YouTube videos, chat on SQLhelp, or contribute improvements to open source tools.

Giving back pays you back, too.

Oh sure, there’s a fantastic feeling when you see the light go on in someone’s eyes because they suddenly grasp a new concept, or when you watch the web site visitor numbers go up. (Slowly. Very slowly.) But there’s much more than that, and they’re selfish benefits, but I’m going to tout them here in an effort to win you over into the Contributor Camp.

You get recognized when in the speaker room. When you introduce yourself at events, people will recognize your name and thank you for your contributions – because the speakers needed your help, too.

People will start seeking you out. They’ll say hi to you at events, follow you on Twitter and Instagram, and get to know you as a person. You’ll build relationships that will last the rest of your life.

Companies will start seeking you out, too. I can’t emphasize enough how important this is for your career. Go read Rock Stars, Normal People, and You to learn how it turned my own job hunting around. If you’ve already read that, go re-read it again. No, I’m not fishing for web site hits – I’m doing just fine, thank you, hahaha.

People will leave comments. Well, okay, this one’s kind of a mixed bag, because sometimes they thank you for all your hard work to make their life easier, but sometimes…sometimes they won’t.

Yes, people will criticize your work.
It happens to all of us.

Sometimes, they’re going to point out an error in your work. You’re going to feel like an idiot, but…you already feel like an idiot, right? The only difference is, it’s going to happen in public. But “public” is just your blog or a Stack Overflow answer or a Github issue. It’s not like it’s happening on national TV. We’re a tiny industry, and nobody’s gonna remember that one time you made a mistake – especially if you handle it by testing your work afterwards to double-check, thanking the commenter, and integrating the correction into your work.

Other times, they’re mistaken, and this is kind of a mixed bag too: sure, you can be proven correct, but you still have to go the extra mile of working with the commenter to explain yourself in more detail.

Other times, there’s no right or wrong answer – the work is a matter of opinion – and that can be kinda grueling, too. But even then, be excited and thankful because your work prompted them to get up off their rears and leave a comment.

Let’s do this.

Don’t let the prospect of comments trigger your impostor syndrome: you can do this. It’s time to decide how you’re going to give back and make a plan to pull it off. To help, here’s my PASS Summit 2016 half-day session, 500-Level Guide to Career Internals:

DBA Training Plan 22: Where to Learn More

Whew. We’ve worked through a couple dozen posts together in this DBA Training Plan, and you’re still just getting warmed up.

First, pick your specialty. Here’s some of the more popular specializations in the SQL Server business:

  • Production administration – managing uptime, building new boxes, troubleshooting clusters, setting up storage, designing virtualization infrastructures
  • Performance tuning – taking queries from others and making them faster via rewriting the queries, tuning indexes, and tweaking SQL Server settings
  • Business intelligence – working with end users to get what they need from the mysterious database using SQL knowledge plus visualization tools
  • SQL development – building long, complex stored procedures and functions to transform and return data

There’s no wrong answers here because all of these will have great long-term markets. You need to pick the one that calls the most to you because you’ll need to spend your spare time learning it. It’s rare to find a company willing to invest in your future skills, so your career needs to be in a specialization that you’re genuinely excited to study. After all, it’s hard to suck up and study after hours when you don’t really love the topic.

Next, pick your favorite learning method:

Reading – books, articles, blog posts. Just be aware that the shorter the delivery method, the more you’re probably going to have to consume to get the desired effect. My favorite free reading resource that walks the line between free/short and expensive/long is SQL Server Central’s Stairways. They cover a single topic well from start to finish, and they’re arranged in single-serving chunks. It’s about as deep as you can get without paying for a book, but when you’re ready for books, check out the free book library from Red Gate.

If you want an absolute firehose of blog posts, here is a Feedly collection of the blogs I subscribe to, and an OPML file that you can import into your feed reader.

Watching videos – aside from our YouTube channel, my next favorite free video resource is SQLbits. It’s a UK conference that records videos at all of their sessions and makes them freely available to the public online. You can also search by speakers.

In-person training classes – start by finding your local chapter of PASS, the Professional Association for SQL Server. These local user groups meet monthly, and volunteer speakers talk about a topic for about an hour. Then search for upcoming SQLSaturdays – they’re free one-day conferences with multiple tracks on SQL Server. You may have to travel to get to one, but it’s still cheaper than any other one-day training class options.

When you’ve exhausted those free resources or you’re ready to dive deeper into a topic, we’re here. Sometimes you need to dive deeper into a topic than you can get in blog posts or community videos, and that’s where our paid options come in. For example, our Senior DBA class covers advanced real-world experience on high availability and disaster recovery topics, and our SQL Server Performance Tuning class shows you how to save valuable time and make the server go faster immediately. Learn more about our training now.

DBA Training Plan 21: Building a New SQL Server

I have a secret setup checklist.

Oh sure, our free First Responder Kit has a SQL Server setup checklist that walks you through a few things you need to do before you install, plus a few things to do immediately afterwards. That’s good. It’s actually pretty darned good.

But it’s not the secret checklist.

Gather round and cover your monitor for a minute because I’m about to share the good stuff.

1. Hardware and Windows prep. Apply the latest bios and firmware to the hardware. Install Windows, then apply all patches. Install and configure your server vendor’s hardware monitoring, and the monitoring tools your sysadmins use. For clusters, run the Validation Wizard, make sure all tests pass, and save a copy of the validation report.

2. Smoke test it. Run CrystalDiskMark against all of the storage (local and SAN) and make sure it meets your expectations. (See my Building the Fastest Servers session for more details.) If it’s a physical server, test it by removing and/or disabling the network, storage, and power cables individually, and make sure that all of the server’s connections are fully redundant. If it’s a virtual server, do those same things to the host. Yes, your VM admins are going to be concerned – but honestly, they haven’t tested any of this stuff in the last year either. They’re just assuming it’s going to work, and you know what they say about assuming: you make an ass out of you and Uma Thurman.

3. Install SQL Server and the latest CUs. Here’s where our setup checklist comes in handy, but you’re not done. Restore your existing databases and time how long that takes. It’s your one chance to know how long a production database restore will take on the new hardware, and that helps you find out if you can meet your RPO/RTO goals. Configure your maintenance jobs like with Ola Hallengren’s scripts, and time how long they take. Try performance tuning the backups by striping them across multiple files – often splitting the backups across 4 files will make them complete 4x faster. Try performance tuning the index rebuilds by doing sorts in TempDB and see if that helps.

4. Smoke test SQL Server. Set up your high availability and disaster recovery features like log shipping or AlwaysOn Availability Groups, and fail the databases over between production and DR. Test a planned failover and document all of the steps involved along with how much time they take. Include copious screen shots. Your goal is to make a planned failover easy enough that any of your company’s admins can do it without knowing SQL Server. (Hey, the more you document, the more time you can take off for vacations.) Then try the same with an unplanned failover – yank the power cables out of the production servers (or hosts) and do a failover. This is usually a more manual process with more steps.

5. White-board out the failure possibilities. Diagram out all of the moving parts in the architecture – shared drives, cluster IPs, service accounts – and think about what would happen if each part fails. What would failure symptoms look like? What troubleshooting steps would help you discover that this was the real root cause? If you’re not sure how you’d recognize a failure of that component, now is your chance to force the failure and plan out your troubleshooting fails.

Only then do you start to go live on the new infrastructure.

This sounds like a lot of work, and it is. I’m not saying this as a consultant who wants to bill you a bunch of money – this isn’t usually the kind of work that I tackle myself. Instead, I work with you to build a list like this, and then you go off and execute the plan. You’re already comfortable with each of these tasks in theory – you just need to do them hands-on. Don’t just assume that the hardware will react a certain way when there’s a failure – go test it, find out for sure, and when disaster strikes, you’ll react with calm confidence.

DBA Training Plan 20: Planning Your Next Server

When the business decides it’s time to give up fixing the old server because it’s not fast enough or reliable enough, stop. The worst thing you can do is jump directly to the hardware vendor’s site, check a few boxes, and order a new shiny server that ends up with exactly the same problems as the last server.

1. Get the business’s RPO and RTO goals in writing. These two numbers determine whether you build a cluster, an AlwaysOn Availability Group, a log shipping backup, or a pair of tin cans connected with string. Learn what RPO and RTO are here.

2. Find the right SQL Server feature for your RPO/RTO goals. For example, if you’re not allowed to lose any data, and you need to fail over in less than 1 minute, then your options include failover clustered instances, AlwaysOn AGs with synchronous mirroring, or synchronous database mirroring. Use Page 2 of the HA/DR worksheet linked in the last post to figure that part out, or watch the HA/DR architecture modules of the Senior DBA Class.

3. Review the current server’s wait stats. To help figure out what kind of hardware you need for tomorrow, look at the pains you’re experiencing today. For example, if you’re suffering from PAGEIOLATCH% waits because you can’t cache enough data in RAM, the next server may need significantly more memory (or faster storage, or time spent tuning indexes & queries.) If you’re suffering from LCK% waits because the apps are holding locks on their side, though, you may not see relief no matter what kind of hardware you throw at it.

4. Benchmark your current hardware’s speed. I’ve written about how to check performance on a new SQL Server, but this is also really useful when you’re transitioning away from an old server. For example, take your backup & CHECKDB job runtimes, and the new server should be able to perform at least that quickly (if not more so, depending on what your bottleneck was in the above step, and which kind of bottleneck you’re trying to fix with the new server.)

5. Get the server’s proposed expiration date.  Servers are like milk cartons: both need an expiration date printed clearly on the outside. Ask the business users, “How long does this server need to last?” They usually say, “Forever,” and start laughing, but once the laughing stops, here’s how to handle the discussion.

6. Ask the developers, “What features are we adding during that time?” I don’t expect them to have specific line-by-line answers about which features they’re adding over the next 3 years, but I do expect them to have rough architecture ideas. Are we going to add some kind of big data analysis projects? Machine learning in the database? R, Python, or Java running inside it? Storing new kinds of data, like tracking every web site click?

7. Ask the business, “How much will load grow during that time?” Are we expecting to acquire our next several larger competitors, or maybe conquer sales in a new country? Or are we just expecting to tread water and only deal with incremental growth?

For all of the above questions, the more specific the answers are, the more precise your server sizing can be.

My new hardware

My new hardware

In reality, they’re very rarely specific.

And that’s fine!

But it just means that your server design needs to be more flexible: we need to design something that we expect will change rapidly over time as we learn more about the answers above. That’s one of the reasons why I really love building new environments as Availability Groups up in the cloud: it’s so much easier to add in new, more powerful replicas, join them to the cluster, and then get rid of the old ones. Your server design can leverage the flexibility of the cloud to handle new features or workloads, and you can throw cloud at performance issues.

How to Set Up Your Own Mastering Class VM

When you buy one of my Live Class Season Passes, you can save a lot of money by building your own VM to follow along with the labs. Or, you might just wanna re-run the labs later to see if you can do a better job, or if you’re still mastering a concept.

You’re going to need:

  • A server
  • The Stack Overflow database
  • The indexes & scripts set up
  • Then ideally, back the database up, and set up an Agent job to rapidly restore the database between labs
  • SQLQueryStress

Let’s talk through each piece.

Building a SQL Server

To get a rough idea of how much hardware to use, let’s look at the AWS EC2 VMs I give the class students:

  • Mastering Index Tuning & Mastering Query Tuning use an i3.xlarge: 4 cores, 30GB RAM, local NVMe SSD storage
  • Mastering Server Tuning uses an i3.2xlarge: 8 cores, 61GB RAM, local NVMe storage – we use more cores & RAM here because we run heavier stress tests, since we’re doing server-level discussions

Each server needs at least 500GB local SSD space to deal with the 350GB Stack Overflow database, index creation space, TempDB, backups, and restores. It needs to be fast storage, too: for a rough idea, you’ll need to be able to restore the ~350GB backup in 15 minutes or less. (I’m just using restore speed as one example here – you’ll be doing lots of storage-intensive tasks, like creating indexes on large tables.)

If you want to check your storage speed before downloading the Stack database, run a test with CrystalDiskMark. You want at least 1,000 MB/sec for sequential reads and writes. As an example, my laptop’s SSD speeds are shown at right, and it would be fine for the labs.

Can you get by with less hardware? Sure, but of course your performance will be different than what we’re seeing in class. I actually don’t think that’s a bad thing – every server out there is different – but just be aware that it’ll pose additional challenges for you if you try something like 4 cores, 8GB RAM. Your index creations will be terribly slow, and you probably won’t be able to keep up in class.

Can you use more hardware? Yep, and as long as you’re not caching the entire database in RAM, you’ll probably still have the same basic challenges that we tackle in all of the classes.

After building the server, install:

Getting the Stack Overflow database

Download the 2018-06 version of the Stack Overflow database: 38GB torrent (magnet.) If you’re not familiar with BitTorrent, here are detailed instructions, but just make sure to get the Stack version mentioned earlier in this paragraph – it’s important so you reproduce demos the same way. Your company may block BitTorrent – many do – but it’s the most efficient way to get a >300GB database out there. There aren’t a lot of file hosts willing to share a file that big, heh. You may need to download it from home.

When you extract it, it’ll expand to a ~304GB SQL Server database that you can attach. There are other smaller versions of the database too, but be aware that you’ll get radically different query plans, and some of the demos won’t work the same way since we’re querying for specific date ranges.

It ships in SQL 2008 compatible format, but you usually wanna learn on the most recent Cardinality Estimator (CE). To set that, in SSMS, right-click on the database, go into Options, and change Compatibility Level to the most current version you’re on – but not SQL Server 2019, which behaves quite differently – hold off on that compat level until SQL Server 2019 is released. I’m updating the classes with more modules for SQL Server 2019’s behavior, but if you try that now, you’re going to get wildly unpredictable behavior compared to the rest of your classmates.

Setting Up the Stored Procs and Indexes

We use the First Responder Kit scripts for performance analysis, and we use the below procs to help the demos along. Create these in the Stack Overflow database:

Don’t install that in production, bucko.

Next, run the index creation scripts below. It’s going to take a while, depending on how fast your lab’s storage is. After creating these indexes, we’ll take a backup of the database so that you can restore it each time to set back to a known starting point – rather than creating the indexes every time you restart labs, which can take tens of minutes.

Back It Up and Set Up a Restore Job

During class, between each lab, the students restore their database to this starting point. For sheer performance, you want to run the backup across 4 files – SQL Server backups & restores can actually go faster when you stripe them across more files, even when writing to the same storage.

Then, set up an Agent job to restore them. Here’s the script I use – just modify the file path locations for your backups. Note that mine also has a step to set Cost Threshold for Parallelism and MAXDOP back to normal defaults for the VM we’re working with in the cloud.

In my example below, I’m doing my restores from E:\MSSQL\BACKUP – you may need to modify your backup path and file names.

Test your restore job to get an idea of how long it’ll take – if it takes more than 15 minutes, you probably want to use a faster VM to be able to keep up in class. If the restore takes, say, 20-30 minutes, you’re going to have a tough time keeping up when you’re doing storage-intensive things like creating indexes.

SQLQueryStress for load testing

SQLQueryStress is an easy, free, open source load testing utility. Download the latest version (zip), and extract it to a folder. You don’t have to run a setup or anything – it’s just a single .exe file that you can run whenever you want to do a load test.

Then, download my load test files and extract those to the same folder. Your folder will end up looking like this:

SQLQueryStress folder

And that’s it! See you in class, and hope you have as much fun as Vadim had.

#TSQL2sday: I Just Wish SQL Server Could Restore a Single Object from Backup.

For this month’s T-SQL Tuesday, Kevin Chant asked us to dream up our fantasy SQL Server feature. I’m sure people are gonna ask for flying cars and gene splicing, and that’s awesome. I’m excited to hear the kinds of things they dream up.

Me? I’ve got a really simple ask.

(When you’re talking to Microsoft people, you have to use “ask” as a noun.)

Today, if you have problems with a single file in your multi-file database, you can do an online file restore like this:

That one file is restored from backup, and you can update it by applying subsequent transaction logs, and then eventually bring it online. You can restore individual (or multiple) pages, too. But the thing is – I rarely need to restore a file, or specific pages.

So my ask is:

I just need a single object, like a table someone dropped, or a stored procedure from a mucked-up deployment that somebody didn’t really save in source control the way they were supposed to. I know, there are gonna be foreign key issues, and I’m fine with the keys being marked as untrusted at that point.

Like it too? Vote for Gonzalo Bissio’s feature request.

What’s It Like to Have a Live Class Season Pass?

I don’t usually post product reviews here on the blog, but Vadim Kulikov (aka BlondeDBA in Slack) recently finished all of the Mastering classes, and he took the time to write a really detailed review. Here it is, unedited:

I heard some folks refer to Brent’s training as “Brent Ozar’s Full Experience”. I really like this description, as it accurately describes the training. For some of us who have been in the DBA industry for some time, training is a mandatory part of the job if you want your skills to remain relevant. However, I noticed that most other trainings that I have taken have one common issue, they help to introduce the subject but you don’t retain much due to the lack of hands-on labs and /or long lecturing often reciting from books-online that provides little practical value. In my opinion, Microsoft is not always the best source of information about SQL Server.

One main thing that sets Brent’s training apart from other is the tremendous practical value. The training is very hands-on. The tools that his team developed are the actual tools Brent uses himself in his consulting business. The class introduces the tools and teaches how to properly use them. You learn how to identify bottleneck(s) and how to develop an approach to come up with the most effective solution for the real-world performance tuning scenarios. I purchased the annual pass and went through all the classes. Below is the list of my favorite things about Brent Ozar’s online training:

1. The class provides hands-on material and demonstrates a comprehensive approach to troubleshooting performance problems and identifying root cause. You will learn how to efficiently (with set time limit) improve the situation or resolve discovered bottlenecks completely.

2. In addition to being technically skilled, Brent is a fantastic instructor. Either its speaking with “Clippy” voice or “taking on a personality” of the SQL Server Engine Optimizer, sharing war stories from his consulting gigs and adding constant humor (at times self-deprecating ?) keeps your attention and remarkably doesn’t get old. I have gone through close to a hundred of hours of Brent’s online training (both free and paid) and often run into the same jokes over and over again and still find them hilarious, which adds levity (Brent does get into complex performance tuning scenarios that can be intimidating) and also helps to remember the material. Brent is very creative with his presentation, often uses analogy with car performance or hospital triage situations which I found to be helpful to understand the subject and retain it for a long time.

3. The classes are well structured and tend to build on previously presented material. So, its recommended to take the fundamental classes first to feel more comfortable with more advanced topics. I would also recommend to go through all the specified pre-requisite training to keep up with the class. While lectures have perfect tempo (not too fast not too slow) but the sheer amount of information presented might be overwhelming for the 1st time. I take notes during the lecture and save the Slack chat as Brent addresses other participant ‘s questions in real time.

4. The ability to watch the recorded class before and after (with purchased annual pass) is by far the best option which sets this training apart from every other I have seen. I replaced my “Audible” with “Brent Ozar” subscription, so I can play the courses on my way to work.

5. The Labs are well-thought-out and could be quite challenging. They reinforce the presented material and also demonstrate the effective use of the First Responder Kit tools. You get to use beefed up VMs with large amount of cores and RAM to bring the lab environment very close to the actual production systems you encounter in the field. I love the option to download the different sizes of StackOverflow database to fit your local environment constraints, so you can practice with the labs at your own pace later on your own server.

Every little aspect of this training appears to be carefully considered and planned. Brent started to record with a green screen behind him, so you don’t have to manage multiple screens when viewing the recorded class later. Little detail but makes a big difference and contributes to a pleasant experience.

6. I purchased the annual pass, so could take all the classes I want multiple times. I love that the class starts so early (6am PST), as it give me an opportunity to do at least the first part of my training undisturbed. Most of us have full time DBA jobs and training is often considered a luxury, that you are supposed to arrange on your own time, so “it doesn’t interfere with your DBA responsibilities” ? Re-taking classes helps to capture the parts you missed the 1st time due to the pressures of your job.

7. Another valuable aspect of Brent Ozar training is that you get to ask questions, at times outside the context of the actual class. The feedback you get – provided you can concisely articulate your issue – is very helpful especially if you can’t afford to hire Brent as a consultant.

8. There are many Senior level DBAs who enter the training and contribute to the class by asking interesting questions or share their own experience deploying a particular solution. You can benefit from someone else’s success or failure. It also helps to see how your skills measure up to other DBAs.

9. Brent is constantly improving his content factoring-in students’ feedback. So, you benefit by taking the same class multiple times. I would, however, wish that the classes’ schedule could be a little bit more aligned with the progression of difficulty of a particular class, so you start with basic like “Fundamentals of Index Tuning ”, then progress to “Mastering Index Tuning” for example, and then followed by the most challenging “Mastering Server Tuning”

I highly recommend the training. You will be amazed how much you didn’t know. It’s kind of embarrassing …?

Wow! Thanks, Vadim – I’m honored that you took the time to write this out. Wanna find out what it’s like for yourself? Check out the Live Class Season Pass, and you can also save a lot of money by building your own lab server. Enjoy!

DBA Training Plan 19: The L-Word, Licensing

You’ve gotten through a lot of this training. You’ve tried fixing the problem with indexes, query tuning, and judicious application of various configuration switches. You’ve been watching webcasts and using free scripts, but…

Things just aren’t getting better.

Before you start browsing your cloud vendor’s sizing price list or your favorite server vendor’s build-and-price site, we need to talk about the L-word: licensing. It’s a really complex subject, but we’re going to try to boil it down as simple as possible.

You pay by the CPU core. Even if you’re deploying an older version of SQL Server, you still have to buy and pay maintenance on the current version of SQL Server. That means you’re using core-based licensing whether you like it or not. In the good old days, we’d buy the biggest CPUs possible and clap our hands when it sat around at 0% CPU, but those days are over. These days we want as few cores as possible, but that go as fast as possible.

SQL Server Standard Edition is about $2k USD per core, and it’s limited to 24 cores and 128GB of memory. Even if you deploy just 8 cores (like 2 quad-core processors), that’s $16k of licensing. 64GB of memory is really cheap – you’d be crazy to deploy SQL Server on a physical box with anything less than 64GB of memory. Even on virtual machines, keep things in perspective – 128GB RAM is way, way cheaper than a core of licensing when you’re licensing VMs individually with Standard Edition.

SQL Server Enterprise Edition is about $7k USD per core, but much like Brent Ozar Unlimited, it has no limits. (Okay, actually, both us and Enterprise Edition have some limits.) Enterprise Edition adds a lot of the cool features you really want, like online index creation & rebuilds, readable secondaries in AlwaysOn Availability Groups, and Transparent Data Encryption. However, at $7k per core, you need to be really, really careful about how many cores you buy – most of the time, folks buy way too many cores and don’t spend enough on memory or storage throughput.

Virtualization – if you have less than 4-5 VMs, you’re usually better off buying Standard Edition at the guest level. Once you have more than 4-5 VMs, buy two (or more) virtualization hosts dedicated just to databases. License Enterprise Edition at the host level, and you can run as many guests as you want on those hosts. However, you need to buy Software Assurance (it’s like maintenance) so that you can use License Migration and move your VMs around from host to host. Useful in case one of your hosts fails.

Containers are just like VMs, which makes the whole container deployment thing a really awkward discussion. Microsoft is all, “Just use a bunch of containers!” but you either have to track their licensing individually with Standard Edition, or take every container host where your containers run, and license ’em all with Enterprise Edition. Come true-up and auditing time, you’ll have a lot of interesting questions around, “So, how many SQL Server containers have you been running, and on which hosts?” The easiest answer is to have a dedicated Kubernetes cluster where your SQL Server containers run, and license those hosts with the $7K/core SQL Server Enterprise Edition. When I say the “easiest answer,” I’m referring to your job and mine. The poor finance people, on the other hand, those folks are screwed when they have to write that check. Because of that, I have a tough time selling people on containers for production servers – unless they’re getting free site licenses from Microsoft in exchange for publicizing containers. Those folks seem to be all over it. Go figure.

SQL Server Developer Edition has the same capabilities as Enterprise Edition, but it’s not to be used for production. It’s free, so the rules for virtualization are a little different here. You don’t want to intermingle your Developer Edition VMs on hosts that you’re licensing with SQL Server Enterprise Edition because you’d just be wasting licensing fees. Put these Developer Edition VMs in with the rest of your servers.

Everything’s negotiable if you’re big. If you’re a service provider or a large enterprise, you may have different types of licensing agreements like SPLA or EA. Have their people take your people out to golf (I hear that’s how this stuff works), and people get drunk, and contracts get signed.

SQL Server Licensing GuideIf the licensing costs scare you, that’s where cloud alternatives to SQL Server come in. All the cloud vendors will rent you licensing by the hour, included with your VM pricing. It’s not to say that’s cheaper – over the long term, if you have predictable needs, it’s not – but it can let your accounting team classify the expense differently, and sometimes that’s a good thing.

For the gritty details, check out Microsoft’s SQL Server Licensing Guide PDF. It’s not short, but it’s definitive, and it’s pretty easy to read. If you’ve got any interpretation questions, you have to ask your Microsoft contacts, not folks in the community, and not salespeople. Nobody else can give you answers that will stand up when the auditor comes knocking on your door.

Comments are disabled on this post. That’s because in the past, I’ve learned that anytime I post about licensing, I get dozens of licensing questions in the comments. For licensing questions, contact your Microsoft sales representative. I’m not doing their work for free, ha ha ho ho.

DBA Training Plan 18: Managing Concurrency

Locking isn’t a problem – queries are gonna lock data. As long as nobody else wants the lock, you can take your sweet time locking it, and it won’t show up as a problem. (That’s one of the many reasons queries work so well in development, and then hit performance issues in production.)

Blocking, now that’s where the problem shows up. Blocking means that someone is holding a lock, and they’re blocking someone else from getting the data they need. Even worse is “deadlock” – that term strikes fear in the hearts of database administrators. Deadlock issues strike without warning, kill transactions dead, and slip away without a trace. Because they’re so transient, they’re hard to troubleshoot.

Good news, though – this stuff really isn’t hard to track down once you use a methodology. Here’s mine: first, identify if blocking is slowing things down.

  1. Run sp_BlitzFirst @SinceStartup = 1 to see if LCK% waits are a significant problem for you, or if your server is bored. Note the number of lock waits – if you’ve only had a few dozen over the course of a day, then it might have just been a few queries held up when someone was trying to create or rebuild an index. If you see hundreds (or heaven forbid, millions or billions) of these waits per day, especially with long (seconds or more) average wait times, then we’ve probably got a problem.
  2. Run sp_WhoIsActive and see if LCK% waits are showing up for queries running right now. This will help you identify which queries are involved, and what databases they’re running in.
  3. Finally, run sp_BlitzIndex @GetAllDatabases = 1 and look for “Aggressive Indexes” warnings pointing to the tables that are involved in the blocking. These indexes/tables aren’t necessarily the problem – they’re just involved with the blocking, just like you are. We wouldn’t fire you just because you’re involved with the blocking. (Well, maybe we would.) Interpreting these Aggressive Indexes warnings can be a little tricky (just like interpreting those Human Resources warning emails you keep getting.)
Mitch Bottell is preparing to take an exclusive lock on some BBQ.

Mitch Bottell is preparing to take an exclusive lock on some BBQ at SQL Saturday Sacramento.

Once you’ve identified how big the problem is, which queries are involved, and what tables & indexes they’re trying to lock onto, then we can start fixing the root cause of the problem. There’s three ways to do it, from easiest to most work required:

  1. Tune your indexes – get rid of the indexes you don’t need, and add the right indexes to help queries jump in and out quickly without holding huge locks to do table scans.
  2. Tune your queries – get rid of implicit conversions, functions in the WHERE clause, and other SARGability foes that cause your tables to scan an entire index rather than doing a seek. The less data we can read & lock, the more predictable and small our locks become.
  3. Change your isolation level – let queries cooperate with RCSI or SNAPSHOT isolation instead of waiting in line for locks. That last one is the easiest in theory, but just make sure you read that entire post (and the linked ones in it) to understand what you’re getting into, since it can backfire.