Blog

Execution Plans, SQL Server 2017, Statistics

Warm fuzzies

I think this is how my mom would have felt if I went to college.

You see, once you stop cargo culting around index fragmentation, and disk queue length, you start to realize which things are actually important to query performance.

Things like statistics, and cardinality estimation. And, sure, having the right indexes can help with that, too.

Annoyance in chief

Microsoft, in the past, has surfaced Proustian levels of information about indexes, and fragmentation, and they published numbers at which you should reorganize and rebuild your indexes on those 5200 RPM drives to keep disk queue lengths low.

Or something.

This has led to an astounding, if not absurd, number of questions about index maintenance.

It doesn’t help that Microsoft hasn’t budged from the 5%/30%/1000 pages numbers, the same way they haven’t budged from the MAXDOP/Cost Threshold for Parallelism defaults despite modern single CPU sockets that have more cores than entire server rooms had when the defaults were decided on.

Finally

More information about statistics has started to get added and surfaced. Helpful information!

Starting with the vNext DMV sys.dm_db_stats_histogram.

Now, something else cool got added. Statistics information resides in query plans, and not just when you use a special Trace Flag.

Now, it’s not quite as verbose as other statistics information that you can get with certain Trace Flags turned on (2363 if you’re on 2014+, 9202/9292 previously), but it’s a real good start.

You can immediately get an idea of how stale your statistics are via the last update and modification values, and how high or low your sampling rate is.

And yes, I’ll be surfacing this in sp_BlitzCache. Just, you know, you’re gonna have to upgrade to see it.

Won’t you please just upgrade?

Hugs

I wanna hug whoever added this.

Fair warning: if I find out who you are, and you’re at PASS, the hug is happening.

Thanks for reading!

Will SQL Server Always Do Index Seeks on Numbers?

Last Updated May 11, 2017

Indexing

I often get the question of “Will SQL Server do an index seek with ____?”

Another way to phrase it is, “Is this query sargable?” We throw the term “sargable” around to mean that SQL Server can do an index seek to satisfy your search arguments. (We’re specifically talking about index seeks here, not scans – not that either one is necessarily better than others in all cases.)

The best way to show it is with a series of queries. I’m going to use the Stack Overflow Users table from the SQL Server data dump download, the same table I use in my free How to Think Like the Engine class.

The StackOverflow.Users table is really simple – it holds exactly what you think it holds – and it has fields with a few different data types:

Age – int
CreationDate – datetime
Location – nvarchar

We’ll create indexes on each of those fields to start with. This way, when I’m searching for those specific fields, I may be able to get an index seek:

CREATE INDEX IX_Age ON dbo.Users(Age);
CREATE INDEX IX_CreationDate ON dbo.Users(CreationDate);
CREATE INDEX IX_Location ON dbo.Users(Location);

CREATE INDEX IX_Age ON dbo.Users(Age);

CREATE INDEX IX_CreationDate ON dbo.Users(CreationDate);

CREATE INDEX IX_Location ON dbo.Users(Location);

Now let’s try a few queries and see what happens.

Simple integer comparisons

When we do an equality seek on Age, like WHERE Age = 21, of course we get an index seek. But what about when we CAST Age as a tinyint or bigint?

All three of those get an index seek on our IX_Age index. Yay!

Let’s make things a little more difficult: let’s look for Age = 3 * 7.

AWESOME. SQL Server is smart enough to do the math first, and even comes up with the right row estimates! Love it.

What if we use a function that isn’t exactly predictable, like getting the number of minutes after the hour:

SQL Server still does an index seek. Isn’t that awesome? I think it’s awesome. I’m easily impressed, though.

But what if we pass in a string?

When we pass in a different data type, SQL Server does implicit conversion. It guesses what it should convert.

Implicit conversion in your favor, collect $200

SQL Server is smart enough to realize that it can just convert your string over to a number. If you hover your mouse over the index seek in the plan, it’s also got the right number of estimated number of rows, too.

The same thing happens if you pass in a string variable, too:

It turns out that SQL Server is really good with integers. On this monster data type conversion chart in Books Online, almost everything converts beautifully when you’re comparing it against an integer field:

Looks like a really bad game of Othello.

What if we pass in…a date?

Things start to fall apart in the blue squares, though. Check this out:

DECLARE @MyDatetime DATETIME = GETDATE();
SELECT COUNT(Id) FROM dbo.Users WHERE Age = @MyDatetime;

1 2	DECLARE @MyDatetime DATETIME = GETDATE(); SELECT COUNT(Id) FROM dbo.Users WHERE Age = @MyDatetime;

Believe it or not, that query is technically, morally, and legally valid:

You got your datetime in my peanut butter

Yes, we’re comparing a datetime to an integer.

And ironically – this returns no rows, but it does an index scan rather than a seek. SQL Server upconverts everyone’s age into a datetime! If you hover your mouse over the SELECT, surprisingly, the warning doesn’t say anything about your lack of sobriety.

I know what you’re thinking: how could you possibly convert an integer to a date? Well, it’s quite easy:

Yes, for SQL Server, the number 26 is the same as January 27th of 1900.

So let’s say – just theoretically – that you had a valid reason to pass in a datetime from end users, and compare that to an integer field. Could you make this query do an index seek rather than an index scan? Absolutely – we just have to do our own explicit conversion rather than relying on SQL Server’s implicit conversion. The highlighted part of the query is the key:

Voila – now I get an index seek, which is good. However, hovering my mouse over the index seek returns something curious – SQL Server expected tens of thousands of rows to come back:

But no rows are returned because Methuselah doesn’t have a Stack account. YET.

Which brings up an interesting point.

There’s two parts to this sargability thing.

First, can SQL Server take your search arguments and turn them into an index seek rather than a scan?

Second, can SQL Server use your search arguments combined with statistics to make a good estimate on how many rows will come back? To learn more about that, watch our video class Statistics: SQL Server’s Guessing Game. Enjoy!

Group Post: If I Took Another DBA Job, My First Question Would Be…

Last Updated May 10, 2017

Production Database Administration, Professional Development

“Question 1: how do you predict how long a rollback will take?”

Brent says: “How many people are in the on-call rotation with me, and can I give them each a technical test?”

Once you get out of the on-call rotation, it’s really hard to go back to the electronic tether. I’d be willing to do it, but only if there are other sharp people to share the burden with me. The questions on the test wouldn’t be trivia – I would just want to hear more about their troubleshooting process, learn what actions they were comfortable taking, and know that I could trust them to make situations better rather than worse.

Bonus points if they have a good change control process that avoids random changes to production servers, thereby ruining my drinking night because some yo-yo wanted to deploy a new version of their non-critical app and suddenly adds an N+1 bug that knocks out services to unrelated apps.

Erik says: “What’s the oldest version I’d have to support, and what are your upgrade plans?”

Back when I started working here, SQL Server 2014 was still pretty new. It also didn’t have what I’d call anything “groundbreaking” in it. There were some okay new doodads, but I didn’t point in awe like “THIS WILL SOLVE SO MANY PROBLEMS”.

That’s not true for SQL Server 2016, and oh-my-dear-sweet-Robert-Smith-wig, it is ever not true for SQL Server vNext. The stuff in those two releases is heart and mind changing. Like, I think people will actually hate Microsoft less because of what they’re doing these days. Maybe. As long as they leave the Start menu alone.

Knowing what’s available in newer versions makes supporting and troubleshooting (especially performance troubleshooting) older versions downright annoying for FTEs.

Tara says: “How often does the on-call DBA get called during his/her week of on-call duty? What is the most common thing that the on-call DBA gets called for in the middle of the night? Which monitoring tools do you have in place?”

You know why she’s smiling? She’s not on call.

Like most production DBAs at large companies, I’ve been through horrific on-call weeks where you don’t get much sleep. As long as the bad weeks don’t happen too often, you can manage it. When every week is a bad week, you may start looking for a new job.

I’ve been in on-call rotations where there are a few hundred SQL Servers. It was rare to not get called at least once while you were sleeping during your on-call week. Most of the late night calls were due to disk space issues. Most of those should have been handled during the day. It’s imperative that the production DBAs be proactive to limit the on-call work. The on-call team needs to have enough time in their day to do proactive work. If they don’t have extra time, then the team is not sufficiently staffed.

Monitoring tools are a critical component of production environments. If I get woken up at 3am for a performance issue, I want to have monitoring data that I can look at to help solve the problem quickly so that I can go back to bed. Even if your company can’t afford a really nice monitoring tool, you can create your own. At the very least, log current activity to a table with sp_WhoIsActive. There are lots of other free tools.

Richie says: “Who would I be reporting to, and what level of interaction would my manager have with the team that I’m on?”

No, Richie, we’re not pulling your finger again

Actually, that’s not true. My first question would be “Why in the hell am I taking a DBA job?” If you are unaware I’m the only developer on the team (official title: dataveloper) so for me taking a DBA job would be a bit out of my comfort zone. So let’s change DBA to developer.

The most important factor for me taking a new gig is who would I be working for. The manager literally defines success for their employees. If you get a good one, they’ll create an environment that encourages your success. If you get a bad one, they’ll give you unrealistic deadlines, change the definition of success, and blame you for their failures.

Modern corporate software development is also team undertaking. I may have one of the few jobs where I’m working on software all by my lonesome. The success of a project isn’t defined by your individual contributions but the completion of the team goals. I’ve been with some companies where my manager had no interaction or control over the team/project that I was a part of. When that happens the manager is getting information about your performance second hand without understanding the full circumstances. Like a child’s finger painting, the performance review becomes an ambiguous, inaccurate mess.

In the end get a manager that you can trust. I’ve worked fourteen hour days for two straight months for a manager that I trusted. I’ve also had a manager that I refused to be in a room alone with him. I think that’s why it was so easy to join Brent Ozar Unlimited. I’ve known Brent for a long time and knew I could trust him. I knew he would be honest with me and let me know when there was a problem. I knew he would create the environment for my success. Please don’t tell him that I said any of this. He’d probably have me walk over hot coals if he knew.

What about you? What would your first question about the job or company be?

Building a Faux PaaS, Part 1: The SQL Server DevOps Scene in 2017

Last Updated June 8, 2017

Architecture, kCura Relativity

In the cloud, treat your servers like cattle, not like pets.

In the cloud, systems administration is very different than the on-premises stuff you’re used to. When you build VMs in the cloud with Infrastructure-as-a-Service (IaaS, meaning AWS EC2, GCE, or Azure VMs), you expect them to die. It’s just a matter of time. If you’re lucky, it’ll be years from now, but if you’re unlucky, it’ll be tomorrow.

This sort of thinking drove Netflix to create the Chaos Monkey. In their 2010 post 5 Lessons We’ve Learned Using AWS, they wrote:

“One of the first systems our engineers built in AWS is called the Chaos Monkey. The Chaos Monkey’s job is to randomly kill instances and services within our architecture. If we aren’t constantly testing our ability to succeed despite failure, then it isn’t likely to work when it matters most?—?in the event of an unexpected outage.”

That’s right: they have a tool that randomly terminates production instances.

Imagine being an admin in an organization that runs the Chaos Monkey. You start to think completely differently: that precious server you’re about to put into production was born to fight a villain enemy out to kill him. The enemy will win the battle, but not necessarily the war.

You have to be able to lose soldiers – individual servers – at any time, but design your infrastructure in a way that the entire application, whatever service you’re providing, will win the war overall.

That’s why cloud sysadmins have started treating infrastructure as code.

Don’t think of VMs as servers you build.
Think of them as applications you deploy.

In companies like this, from the moment the VM powers on, everything you do to get it ready for production needs to be scripted, repeatable, and eventually, automated to the point where it happens without human involvement. And when you’ve got this much scripting in play, that also means controlling the source code just like you would an application’s code.

You rarely see Chaos Monkeys around database servers.

Database administrators have always treated our servers like the most special of pets. We pick just the right breed, we give them special names, we train them very carefully, we teach them tricks, and we build a personal bond with them. When they die, we’re heartbroken, and we feel like we have to start over from scratch.

Most SQL Server shops big and small simply aren’t prepared to do all of this in an automated fashion:

Deploy Windows
Install & configure failover clustering so the server can join an AG
Install SQL Server and configure Always On High Availability
Join the right cluster
Install all your utility stored procedures, backup jobs, Agent alerts
Restore the relevant databases
Join the right Always On Availability Group with the right replication type (sync vs async)
Modify the read-only routing lists on all the replicas

This stuff is hard to do manually, let alone automatically. Therefore, we think of our servers as precious hand-crafted pets rather than cattle we could just lose at any time without heartbreak.

That’s what a Platform-as-a-Service does for you.

Azure SQL DB and Amazon RDS do all this stuff for you at the swipe of a credit card. They take away the plumbing parts of database administration that suck: building, backing up, patching, corruption repair, etc. For any new app builds today, I recommend thinking about PaaS first. (It’s what we use for our own development.)

But for existing apps, PaaS has a few showstoppers. Let’s take kCura Relativity, an app I’ve blogged about before. Here’s why they can’t just switch to PaaS hosting:

Missing features – for example, Relativity relies on linked server queries. Amazon RDS has a kind of sketchy implementation, and Azure SQL DB requires you to pre-define individual table structures. Neither of those work with the way Relativity is built today, and the code changes required to support either platform would be fairly expensive. (That’s not the only one, obviously, but I’m keeping the examples simple for this post. I can already hear the armchair architects going, “just tell the stupid developers to rewrite their stupid app.” That ain’t how the real world works.)
Capacity/performance limitations – Azure SQL DB maxes out at 4TB per database, and Amazon RDS maxes out at 30 databases per instance. Both of those present problems for Relativity.
Exporting raw data – Relativity users want to be able to sync on-premises versions of the data, sometimes migrating from on-prem to the cloud, and sometimes migrating back out. With PaaS, this involves long outages that aren’t acceptable to attorneys in the middle of frantic case review.

So in this case – as I’ve seen with a few other clients – PaaS isn’t quite ready today to handle the challenges of a global ISV with a mature, profitable application. They simply can’t hit the pause button on building new features, and take time out to do an expensive back end rewrite. (Although in the case of kCura, they’re moving some parts of the data out of SQL Server where it makes sense, thereby making the app easier to handle in PaaS-like environments.)

Hmm. If Azure SQL DB and Amazon RDS aren’t a good fit yet, but we want bulletproof reliability and automatic scaling, and we gotta use the boxed product (Microsoft SQL Server), what do we do?

What would SQL Server need to go up against the Chaos Monkey?

What if, at the push of a button, we could deploy a VM with the right Windows config, set up clustering, get SQL Server installed correctly, restore the right databases, and join an Availability Group?

Sure, that would help us defend against the Chaos Monkey because if any one instance disappeared, we’d be able to rapidly stand up its replacement just by hitting the button. The other AG members would cover in its place for a while.

Even better, what if we mounted that button in a place that anyone – or anything – could push? What if we enabled our monitoring tools to push that button for us when things were going wrong?

That button would be good for a whole lot of bonus stuff.

That button could completely change how we do patching, for example. Rather than patch an existing server, just define (in code) the appropriate patch levels, and hit the button. Let the system stand up a new replica with the right patch levels. When it’s good to go, simply fail over to it, and then delete the old replica. Cattle, not pets – that cow did a good job, but it’s not needed anymore, and we can just make it go away.

The button would change how troubleshooting works. Having problems with a janky replica? Can’t figure out why it’s failing or throwing strange errors? Just hit the button, and a new one appears to replace it.

SQL Server DBAs have a hard time with this button.

We’re so attached to our pets that we have a hard time saying, “Aw, screw it, just stand up another SQL Server and kill that troublesome one.”

We need to flip things around:

Ops/cloud admins are all about putting a lot of work into building that button, and then hitting that button as often as necessary.

That button is what kCura’s Mike Malone termed a Faux PaaS: it’s like Azure SQL DB’s Platform-as-a-Service, but something you build and manage yourself. It’s a SQL Server service that stands a much better chance of overall success against the Chaos Monkey. Sure, you could build this kind of thing on-premises, but the cloud’s easier deployment APIs make this more feasible.

I’ve had clients build this kind of thing privately, but I’m excited that for the first time, I’m on a project that I can talk about publicly. Over the coming weeks, I’ll talk about why you might build something like this, design considerations, and common gotchas. Next week, we’ll cover choosing the right instance types, storage, and backup location for your RPO/RTO goals. Along the way, I’ll even share new open source tools with you to make the journey easier.

Continue Reading Part 2:
Choosing and Testing a Cloud Vendor

The Case of Entity Framework Core’s Odd SQL

Last Updated May 11, 2017

Richie Rump

Development

Recently, Julie Lerman contacted me about some strange SQL that Entity Framework Core (EF Core) was generating. Now, EF Core is fairly new and works with Microsoft’s newest framework .NET Core. I haven’t had a chance to get my hands on EF Core yet so I was excited to see what it was doing. And of course when I saw the SQL my jaw hit the floor. Here’s an example of the SQL she sent me to accomplish an insert:

exec sp_executesql N'SET NOCOUNT ON;
DECLARE @toInsert0 TABLE ([IsPointy] bit, [Name] varchar(250), [_Position] [int]);
INSERT INTO @toInsert0
VALUES (@p0, @p1, 0),
(@p2, @p3, 1),
(@p4, @p5, 2);

DECLARE @inserted0 TABLE ([WeaponId] int, [_Position] [int]);
MERGE [Weapons] USING @toInsert0 AS i ON 1=0
WHEN NOT MATCHED THEN
INSERT ([IsPointy], [Name])
VALUES (i.[IsPointy], i.[Name])
OUTPUT INSERTED.[WeaponId], i._Position
INTO @inserted0;

SELECT [t].[WeaponId] FROM [Weapons] t
INNER JOIN @inserted0 i ON ([t].[WeaponId] = [i].[WeaponId])
ORDER BY [i].[_Position];

',N'@p0 bit,@p1 varchar(250),@p2 bit,@p3 varchar(250),@p4 bit,@p5 varchar(250)',@p0=1,@p1='Katana',@p2=1,@p3='Kama',@p4=0,@p5='Tessen'

exec sp_executesql N'SET NOCOUNT ON;

DECLARE @toInsert0 TABLE ([IsPointy] bit, [Name] varchar(250), [_Position] [int]);

INSERT INTO @toInsert0

VALUES (@p0, @p1, 0),

(@p2, @p3, 1),

(@p4, @p5, 2);

DECLARE @inserted0 TABLE ([WeaponId] int, [_Position] [int]);

MERGE [Weapons] USING @toInsert0 AS i ON 1=0

WHEN NOT MATCHED THEN

INSERT ([IsPointy], [Name])

VALUES (i.[IsPointy], i.[Name])

OUTPUT INSERTED.[WeaponId], i._Position

INTO @inserted0;

SELECT [t].[WeaponId] FROM [Weapons] t

INNER JOIN @inserted0 i ON ([t].[WeaponId] = [i].[WeaponId])

ORDER BY [i].[_Position];

',N'@p0 bit,@p1 varchar(250),@p2 bit,@p3 varchar(250),@p4 bit,@p5 varchar(250)',@p0=1,@p1='Katana',@p2=1,@p3='Kama',@p4=0,@p5='Tessen'

If you’re an experienced SQL tuner, you’ll notice some issues with this statement. First off the query has not one but two table variables. It’s generally better to use temp tables because table variables don’t have good statistics by default. Secondly, the statement uses a MERGE statement. The MERGE statement has had more than it’s fair share of issues. See Aaron’s Bertrand’s post “Use Caution with SQL Server’s MERGE Statement” for more details on those issues.

But that got me wondering, why would the EF team use SQL features that perform so poorly? So I decided to take a closer look at the SQL statement. Just so you know the code that was used to generate the SQL saves three entities (Katana, Kama, and Tessen) to the database in batch. (Julie used a Samurai theme so I just continued with it.)

A Deeper Look

SET NOCOUNT ON;
DECLARE @toInsert0 TABLE ([IsPointy] bit, [Name] varchar(250), [_Position] [int]);

1 2	SET NOCOUNT ON; DECLARE @toInsert0 TABLE ([IsPointy] bit, [Name] varchar(250), [_Position] [int]);

The first part of the SQL creates a table variable called @toInsert0. This table variable will have the values that we want to insert into the Weapons table. It also adds a column called _Position, this will keep the order that the row was inserted in. This is needed later so that the WeaponId can returned in the order it was received.

INSERT INTO @toInsert0
VALUES (@p0, @p1, 0),
(@p2, @p3, 1),
(@p4, @p5, 2);

INSERT INTO @toInsert0

VALUES (@p0, @p1, 0),

(@p2, @p3, 1),

(@p4, @p5, 2);

The next bit is pretty straight forward. This inserts the data of the three entities into the @toInsert0 table variable. The @toInsert0 will be later used to insert the data into the Weapons table.

DECLARE @inserted0 TABLE ([WeaponId] int, [_Position] [int]);

1	DECLARE @inserted0 TABLE ([WeaponId] int, [_Position] [int]);

A second table variable, @inserted0, is created but this time with WeaponId and _Position as columns. This table will hold the new WeaponId’s that will be generated by inserting the data into the Weapon table.

MERGE [Weapons] USING @toInsert0 AS i ON 1=0
WHEN NOT MATCHED THEN
INSERT ([IsPointy], [Name])
VALUES (i.[IsPointy], i.[Name])
OUTPUT INSERTED.[WeaponId], i._Position
INTO @inserted0;

MERGE [Weapons] USING @toInsert0 AS i ON 1=0

WHEN NOT MATCHED THEN

INSERT ([IsPointy], [Name])

VALUES (i.[IsPointy], i.[Name])

OUTPUT INSERTED.[WeaponId], i._Position

INTO @inserted0;

This is where the magic happens. EF Core is using the MERGE statement to insert data from the first table variable (@toInsert0) into the Weapons table. The join that it is using is 1=0, which will always be false. So this statement will always run the WHEN NOT MATCHED section and insert the data into the Weapons table. Then, the OUTPUT line will put the newly created WeaponId and _Position into the @inserted0 table variable. This is necessary in order to return the newly created WeaponId to the client. In this case, WeaponId is an IDENTITY column which means SQL Server will automatically assign a value for WeaponId upon insert.

SELECT [t].[WeaponId] FROM [Weapons] t
INNER JOIN @inserted0 i ON ([t].[WeaponId] = [i].[WeaponId])
ORDER BY [i].[_Position];

SELECT [t].[WeaponId] FROM [Weapons] t

INNER JOIN @inserted0 i ON ([t].[WeaponId] = [i].[WeaponId])

ORDER BY [i].[_Position];

The final statement returns the newly created WeaponId(s) to Entity Framework. A few things to note: you could rewrite this statement to exclude the join and just select the data from the @inserted0 table variable, but I understand why they did it. What they’re doing is insuring that the data returning actually exists in the Weapons table. The other thing to note is the ORDER BY. In general, we want to avoid ORDER BY but the EF team needed to guarantee the order that was returned is the same order that was inputted so that the WeaponIds can be assigned to the correct weapon entity.

With a bit of testing, I realized that this statement wasn’t only being used for batch requests, but also for entity inserts with more than one entity being saved. Also, it appears that the limit per statement is 2,000 attributes. For example, if 6,500 attributes are being initially saved to the database EF Core will create four of these statements: three statements of 2,000 attributes and one statement of 500 attributes . The number of entities that are saved in each statement depends on the number of attributes being saved. In this case, 1,000 entities were saved per statement for a total of 3,250 entities saved to the database.

Rationalizing It All

Now that we’ve taken a look at the generated SQL, shouldn’t we be enraged? Wouldn’t we rip our devs a new one if we saw table variables, merge statements, and unnecessary sorts in their SQL? But let’s take a look at the core problem that the EF team is trying to solve: performance. Yea, I said it. In previous versions of Entity Framework, data was inserted one statement at a time. So if you needed to insert 5,000 rows into a table, you got 5,000 insert statements. This was necessary because the application needed the SQL-Server-generated identity value. But with this new approach, you could insert 5,000 rows of data with one statement.

But wouldn’t it be faster with temp tables? Sure! But you can’t guarantee a temp table name would be unique in your session (because you may have done other things along the way.) You could name a temp table with a guid, but there’s no guarantee that a guid is unique either. So the safest route would be a table variable and not a temp table.

But why use the MERGE statement if it’s so buggy? I think this goes back to the performance question. We could use n number of inserts but that’s not solving the performance problem that we originally had. In short, I think I’m OK with the MERGE statement. It’s only being used for inserts, and it’s faster than 5,000 insert statements.

But what about that ORDER BY in the SELECT statement? You kinda got me there. Could they be doing the sorting in the code but they went the SQL route. But it’s not so bad, since they’re creating a new SQL statement for every 2,000 entities SQL Server will only be sorting 2,000 rows.

To decide if it’s a bad thing, let’s compare the older version of Entity Framework (EF 6) with the new version (EF Core). In my totally unscientific tests, saving 10,000 entities to the database using the EF 6 took 4.06 seconds while saving the same 10,000 entities with EF Core took 1.46 seconds. So yes, the SQL isn’t perfect, but it’s 248% faster.

Brent says: hoo, boy. This is…not ideal.

[Video] Office Hours 2017/05/03 (With Transcriptions)

Last Updated May 7, 2017

Brent Ozar Unlimited Team

SQL Server, Videos

This week, Brent, Tara, and Richie discuss choosing batch sizes, SQL 2017 on Linux, how much memory to leave on the OS for the SQL server, parameter sniffing, allocation unit size of SQL Server, triggers, X-Files, aliens and much more!

Here’s the video on YouTube:

Office Hours Webcast - 2017/05/03

Watch this video on YouTube

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017-5-3

How should I update 50 million rows with minimal impact?

Brent Ozar: Jam says, “Do you guys have a good blog post that I can use to update 50 million rows in a SQL Server table? Should I use like the top clause?” I guess a bigger – I wonder like how many rows are in it, if you’re trying to update all of them or if it’s a percentage, what else would you ask?

Richie Rump: Indexes? How many indexes are on there and constraints?

Tara Kizer: Yes, do you need to drop them? I’ve had to do these at EMS, I don’t know if we have a blog post on the website or not but I’ve had to do massive data changes at times. You might want to break it up into batches and use top clause and keep going until you’re done. That’s going to run slower but the idea is to run slower at times so you don’t cause so much blocking. But if you have a maintenance window and if it will fit in the transaction log as one big transaction, you could do it as one, but I don’t think I’d recommend it as one just because it’s so many rows and in case it does rollback you don’t want to be you know, having to wait a long time. So make sure you test it somewhere, but yes, even if you’re in a maintenance window and if your system is 24 by 7 and you don’t want to cause too much blocking, you can break it up into say, 10,000-row chunks. That’s usually what my batch size is, 10,000.

Brent Ozar: Michael J. Swart has a great post on picking your batch sizes, if you search for Michael J. Swart and batch sizes, he’s got a great like AB test graph where he shows all kinds of different batch sizes. The idea isn’t to use his sheet to find the right number for you; it’s to use that methodology to find the right number for you. I’m with Tara, I just use like 5000 or 10,000 and I call it a day.

Can you use Azure SQL DB for your HA/DR?

Brent Ozar: Trisha says, “Is it even possible to…” Aliens. “Do an HADR type solution using built-in functionality between an on-premises server and something like Azure SQL DB?”

Tara Kizer: Availability groups, right? You can…

Brent Ozar: No.

Tara Kizer: You can’t? Is that only when you do the other way?

Brent Ozar: You have to build a VM. So if you build a VM you can, but they don’t let us do a replica up in Azure SQL DB yet and I know somebody in their grandma’s basement – no offence to Tara, nothing to do with – she’s at her mom’s, but somebody and specifically the basement with no windows…

Tara Kizer: We have no basements in San Diego.

Brent Ozar: That’s true.

Richie Rump: Or Miami.

Brent Ozar: Chicago actually – where is it that they actually have basements? Somebody in mom’s basement is going to say, you can do replication using transaction replication, but it’s not DR. That’s designed for a one-way move and you don’t come back.

Tara Kizer: On my Brent Ozar Unlimited profile, I think I say replication is not an HA or a DR solution. I’m adamant about that. I don’t care if you’ve used it for HADR, it is not.

Brent Ozar: It is so true. Tara had me at “It is not a…”

Will SQL Server on Linux catch on?

Brent Ozar: Andrew asked an interesting question, “How do you guys foresee the uptake for SQL 2017 on Linux and do you think that Microsoft will now start to attract the Oracle enterprise database market share?” What do you guys think?

Tara Kizer: I don’t know that we’re going to – that’s going to attract the Oracle market share. I don’t know. Maybe the other database technologies out there but Oracle, I don’t know. I don’t see that happening.

Brent Ozar: Richie what’s your thought? Reading tea leaves. Which character is that that you were flying in front of the webcam there?

Richie Rump: That’s Star Lord in the Milano.

Tara Kizer: Don’t even know what that is.

Richie Rump: Guardians of the Galaxy volume two, coming out, actually today and that’s why Balls IM’d me because he had an extra ticket in Orlando and I’m like, I’m not in Orlando, like I thought you were in Orlando, I’m not going to freaking Orlando, but I considered it. I think the Linux thing is interesting, the question is not to go to Linux from SQL Server, but why would someone on Linux go to SQL Server as opposed to Postgres Oracle? So that’s the question rather than anything else. So I mean, are they going to lower the price so now we’re half as price as Oracle, maybe then that’s a question?

Tara Kizer: We are already. We’re cheaper than Oracle because in order to use any feature in Oracle, you have to pay for it, whereas SQL Server, you do that one payment or whatever, enterprise or standard and then you’re done. Oracle, you have every single feature, you have to add, add, add, add. It’s super expensive in Oracle. Even enterprise edition SQL Server is cheaper than Oracle.

Richie Rump: Yes, so now the question is if it’s not the cost, so is it Linux really the big driver of people going to a different database?

Tara Kizer: It is for one of my companies. They were anti-Microsoft, I mean, every desktop was Windows but servers, they wanted all Linux and we were still a bit SQL Server shop but maybe for companies like that, it might make sense but if they’re anti…

Richie Rump: I have not run to one of those companies yet, so – and I’ve been bounced around a lot like a tennis ball, so the question is why wouldn’t I go to Postgres at that point, right? So if I have a Linux machine, why am I not going to Postgres? What features are in SQL Server that are not in Postgres that I could use or some other database engine? I kind of dig Postgres, I’ve been doing it a little bit, I’m doing a little Postgres dance…

Tara Kizer: Not going to the dark side.

Richie Rump: Oh, it’s dark too and it’s kind of musty but – and it’s different, and it’s definitely different, but if I’m looking at SQL Server on Linux, my question would then be why or what’s – why am I doing that? Because you know the first release isn’t going to be as good as it’s going to be on Windows, so why are we doing it? I don’t know, I just don’t see the end game.

Tara Kizer: [crosstalk] either that hardly any of the features are in the product.

Richie Rump: Yes, exactly. I don’t see the end game, maybe in three years I will but I don’t see a reason why me as a CTO would go to Linux on SQL Server yet. The carrot’s not there yet.

Tara Kizer: Wait a minute; did Richie get to promote it to CTO? Did he get a pay raise with that?

Riche Rump: Actually I had to pay Brent.

Tara Kizer: Maybe I should look at that senior network job on LinkedIn.

Brent Ozar: I’m the same way. The only edge case I’ve ever heard about it is with somebody wants – they have an ISV app and they it’s only certified to use SQL Server and the company refuses to use Windows. That’s the only thing I’ve heard. That’s it.

Brent Ozar: Wesley says, “I know why SQL Server on Linux – because I am a SQL Server lover who is a closeted Linux lover, then I can be like, yes I do that at our local user group meetings.”

Tara Kizer: You’ll be the only one.

Brent Ozar: Exactly. What about all your co-workers? Imagine that you got production SQL Servers on Linux and they barely know Linux and they barely know SQL Server and now they’re herp-a-derping they’re way through both.

Richie Rump: Well, I mean look how quickly adoption is for 2016.

Tara Kizer: Yes, that’s true.

Richie Rump: How quickly is adoption going to be for SQL Server on Linux on a completely different platform?

Brent Ozar: And I always want to show that just because it’s so interesting. I’m going to…

Tara Kizer: I can never remember the website address for that.

Brent Ozar: Yes, SpotlightEssentials.com is Quest’s free version of Spotlight and if you click on Collective IQ up at the top, they have the stats for all of the people who are using Spotlight Essentials, so out there in the wild, they’re seeing zero percent 2016…

SQL Server 2016 adoption rate vs SQL 2000

Tara Kizer: It’s not zero, around eight.

Brent Ozar: Just rounding, but then 2% of SQL Server 2000, so yes, that’s – and it’s not – to be really clear, all of us like 2016. I think it’s the best release yet, I would put it into production myself, big huge fan. And 2000…

Tara Kizer: The issue is that starting with 2008, SQL Server is a really rock solid product. We can compete with Oracle as far as performance goes and there’s a lot of features in there, so why upgrade if it’s working?

Richie Rump: Because it’s six years old.

Tara Kizer: Well, I agree, but that’s what customers are thinking – people who are using 2008. I don’t know what our percentage is for clients. I think 2008 R2 for me might be, I don’t know, 40%, somewhere around there. Most around 2012, 2014. I’ve had one client that’s been on 2016. I’m like, uh-oh, I don’t know if I know how to use it.

Richie Rump: It’s so interesting because it’s so different than from the open source community where we’re getting new versions on a weekly basis. I spent almost all day yesterday upgrading a project because I upgraded one piece of open source project and then it cascaded down to all these other projects that I had to update, and I had to update the unit test, I had to go check all this other stuff just because I updated one, you know? And it was because of a security issue. I needed to update it and then all of a sudden I needed to go down that path of updating everything else and that’s just the way it is in open source, and so not updating your servers, which has things like that and getting the new features and all that, it doesn’t make sense to a developer’s head, which is why I think that we have a lot of disconnect between the way DBAs think and developers think.

Can I do cross-platform Availability Groups?

Brent Ozar: Andrew, I’m going to – I see two things in there that tell me a whole lot about who you are and what you’re doing and I’m not going to mention last names but you’ll see why here in a second. You said for Linux availability groups, he says for one cross-platform availability groups except for the only way that that’s useful is for reading. It’s not useful for automatic failover between those, as far as I know, like the last time I looked at it for the CPT2, you couldn’t do automatic failover back and forth between Windows and Linux.

Tara Kizer: Wait a second, so availability groups now will support another server being Linux just for reading?

Brent Ozar: For reading, that was the last thing I saw on CTP2.

Tara Kizer: Okay, because in 2012 and 2014, probably 2016, you couldn’t even have a different version of Windows. Windows 2012 R2 and Windows 2012, that was not allowed.

Brent Ozar: Yes, starting with Win 2016 and SQL 2016, you can for upgrade purposes.

Tara Kizer: But just reading, okay.

Brent Ozar: Yes, and the other thing he says is try docker containers for your apps. He’s talking about apps that he’s building though, that’s the thing is that when you upgrade one part of it.

Richie Rump: Right, so this is all like MPN pack – I’m going to do this in a database thing because that’s what I’m going to do. So it’s an MPN package, one little piece of software. I’m actually running this app server-less. I don’t worry about the OS, I don’t worry about the runtime, it’s all given to me. This is the actual pieces of software that I downloaded from MPN where one little piece started a cascading effect on everything else underneath it and that’s just the way it works, right? Either you don’t upgrade and you stick to where you’re at or you have to continually change your code and update as the pieces of code that you bring into your application updates. I mean, I got a whole talk on this if you want man, I could go off, but I’m not going to do that.

Brent Ozar: Not today.

Richie Rump: No.

Why is my SQL Server rolling transactions forward on startup?

Brent Ozar: Don says, “Good morning, we had a power failure a few weeks ago and I noticed on SQL Server start-up that there were many hundreds of transactions rolled forward on all databases including master and model. Why were there so many transactions rolled forward when it was off hours and no user transactions were involved?”

Tara Kizer: How many had it completed? I mean, start logging your activity to who is active and see what’s going on. If it were my system at one of my jobs, I would immediately suspect a hibernate issue where the application loses track of transactions and is leaving open transactions. So at previous jobs we’ve had it set up automated stop-to-kill transactions because eventually it starts causing tons of blocking, so maybe you’re experiencing – maybe you’re not using hibernate but maybe you’re experiencing transactions being left open and not being committed or rolled back.

I love it.

How much memory should I leave free, and why?

Brent Ozar: Guillermo says, “What’s the correct formula” – we have all kinds of ideas, I don’t know if any of them are correct – “and what’s the correct formula to determine how much memory to leave on the operating system for a SQL Server? I thought 4GB to 6GB for Windows would be enough but I recently heard it should be more on a 380GB of RAM server.”

Tara Kizer: We have a blog article on that. I don’t know if you want to post the link but it says 10% or 4GB, whichever is higher. So for you, that’s 38.4GB left to the Windows and the other processes and then after that, there’s an article that Jonathan Kehayias has over at SQL Skills that he says after you’ve done that, monitor your available megabytes and for a system of 384GB, you just want to make sure the Windows and other processes always have at least 1GB of free. So let it run, after you’ve given 10% back to the OS, let it run for a couple weeks and see what available megabytes are, and so you can possibly can get back more to SQL Server. But right now you need 38, according to the best practice.

Brent Ozar: Great example for why I was working with a guy in our senior DBA class and he said nothing could possible use that much memory and I said, wait a minute, didn’t you just say you had a Fusion-IO drive in there for tempdb? And he goes yes, and I said, you know they use RAM for their drives that keep a map in memory of where all the stuff is? He said it can’t be that big, and we popped open the manual and I said well, based on the size of your drive and the model, it needs 58GB of RAM per drive. Per drive.

Tara Kizer: Wow.

Brent Ozar: Yes, not a small amount of memory, so definitely watch how much memory you have available there.

Richie Rump: Oh my gosh.

Tara Kizer: That’s insanity.

Brent Ozar: Intels don’t do that but SanDisk, the Fusion-IO stuff keeps maps in memory of where your data lives.

Tara Kizer: Man, I had no idea. One of my jobs, we – and that system is still in production, it’s using Fusion-IO and that was a 256GB of RAM and I would have configured it to the 10% best practice, oops.

Brent Ozar: And you really got to dig to find this out. Like you got to go digging through best practice manuals and go hey, how much does it need. It’s based on the NTFS allocation unit size and the size of the drive.

Why is my stored procedure’s query plan changing?

Brent Ozar: Mark says, “Hi guys. I have a stored procedure that sometimes changes its plan based on an unusual parameter passed into it and then it runs badly for normal parameters. I’ve been using traces to see execution time for that stored procedure and then if I see it increasing I clear the plan and let it recreate with the normal parameter.” What should he do differently?

Tara Kizer: Well, I mean why is it happening? What is causing the good plan to go out of the plan cache? I’ve spent hundreds of hours on this topic in parameter sniffing and I put plan guides in place in 2012 and greater to make sure we get the right plan. I know in older versions one of the solutions was to create a job and that would just run the stored procedure every minute or so, so that if the good plan got out of the plan cache, there was a chance for another good plan to get loaded because they had tested the parameter values. But you can’t do that when the stored procedure is using inserts, updates, and deletes. But look into index ins, plan guides, as well as, you know, option recompile. If you can, depending upon how often and query runs.

Optimize for, that’s a really good one. So if you know what parameter values work best when it gets optimized for a certain set of parameter values then optimize for can be a good choice. I like optimize for the best and I have – instead of putting that directly into the stored procedure code, I have put that into a plan guide just because I didn’t have access to the source control, stored procedure stuff and I can put a plan guide on top of a stored procedure.

Brent Ozar: It works beautifully.

Richie Rump: Do you think Stack Overflow had that problem with John Skeet, and they created their own John Skeet stored procedure for that?

Brent Ozar: Yes, there’s so many good stories around that, especially over at Stack. We did some branching code, if the user ID is this, go run this code, otherwise, run this other code. It has to be in different stored procedures for the store. If you want to deep dive way deeper, go to GroupBy.org. GroupBy.org is our totally free community conference, it’s online, the next one is coming up on June 2th and 9th but Guy Glanister last week had – two weeks ago, had a really good session on how to use parameters like a pro and boost performance. It’s like 90 minutes long, so go to GroupBy.org and click Watch Past Sessions, then go to Parameters down there and really good fast paced session. He goes really quick in there.

What’s different about AGs in Azure VMs?

Brent Ozar: Ben says, “My company is about to set me up with an Azure infrastructure services, including” – nothing against – you know, just if they’re going to set you up, like a Porsche or a Ferrari, not like a server – “including always on availability groups on SQL Server 2012. What should I watch out for with always on availability groups and how is Azure different from private Cloud VMs?” If you were going to give somebody a shorter starter guide on what to watch out with for AGs, what would you warn them about?

Tara Kizer: Well, they’re specifically asking about Azure, so I mean, if this were – to me, it doesn’t matter what the – if it’s on-prem or Azure or somewhere else. What I would look for, for availability groups, is making sure I have the cluster configuration properly, make sure I have a witness, make sure that I have the votings set up properly because, in my environments, we’ve had disaster recovery sites, servers participate in availability group at another site, so voting and [inaudible 0:16:54.2] are the two big ones for me as far as making sure you don’t cause an outage due to misconfiguration. But that’s on-prem or Azure or EC2 or GCE, doesn’t matter. I don’t know specifically about Azure, I don’t know if you have something on that brand.

Brent Ozar: Yes, up on the screen, I’ve got – if you go to simple-talk.com, simple-talk.com is Redgate’s community blog thing and they had two – Joshua Feierman, I believe is how you pronounce his name, or Feerman maybe, had two really good posts a couple, few weeks ago about Azure load balancers and Azure networking. It’s way harder to do this – I say way harder, yes, I think it’s way harder to do this in Azure than it is AWS or GCE. You have to create a load balancer and you have to know how their sub-nats work so that the listener isn’t managed by SQL Server, it’s actually managed by Azure. So he has a great two-part write up in there about Azure load balancers.

Richie Rump: Yes, I found that to be true kind of overall in most things in Azure, that AWS is easier to manage than an Azure – it feels like Azure kind of adds in this enterprise layer that for most people, kind of don’t really need.

Does NTFS allocation unit size affect performance?

Brent Ozar: Guillermo says, “What’s the performance impact in SQL Server of an allocation unit size of 4K versus 64K?”

Tara Kizer: I think I asked that question when I attended SQL skills training years ago and I forget which one it was, Kimberly or Paul, but they didn’t think that that one was as big of a deal as people had made it. We still made sure that we were using 64K for our mount points that had the database files and stuff that had the smaller files we would use the default, but I still do 64 just as a best practice.

Brent Ozar: It’s way less of a deal than it used to be. When we had dedicated magnetic spinning rust then it made more of a difference, but like these days on modern sans, net app does everything internally in 4K chunks anyway for example, so it matters less. Just check with your storage vendor. For example, with Google compute engine, they actually recommend 32K, and that’s actually for SQL Server. I don’t know why, I mean just yes, oddball, they get the best performance that way. Also too, I’m like if that your biggest problem, you’re in pretty good shape. Usually, you see this when people have like 8GB of RAM.

Tara Kizer: If you’re creating a new server you might as well get it right the first time because after the thing goes live and you realize you didn’t set it up right, it’s not easy to fix. We’re talking about reformatting that driver mount point and you know, bringing the files back.

How do I monitor what’s on my SQL Server’s clipboard?

Brent Ozar: I love questions that none of us are going to know how to answer and I’m really curious to see because I don’t know the way to do this. Greg says, “Are third party SQL Server app uses speech recognition and dictation to enter information in the SQL Server and it stores all the text in the clipboard? Do you guys know of any way to monitor and manage the clipboard?”

Tara Kizer: I feel like that this needs to be thrown out. Is this the clipboard on the database server or on the client side? I would hope that this is client side, because, man, I would hope that you have the hardware in place on the database server to – that’s going to be using a lot of memory.

Richie Rump: I mean, the only thing I use with the clipboard is ditto, which is a clipboard manager, which means I can kind of go through all the stuff that I clip, but it doesn’t monitor and it doesn’t do alerting or anything like that. If you need that, no, it sounds like the weirdest thing ever.

Brent Ozar: That’s so cool, I get excited by that kind of thing. Who comes up with this idea?

Richie Rump: I’m not building it for you.

Tara Kizer: Your phone has the dictation and stuff but that’s client side. It’s not doing that on the SQL Server side, if that’s what the back end is, so I don’t think that this is being done on the back end.

Brent Ozar: I bet it’s like, people are saying things and the app puts it into the buffer and then somehow pastes it into like access or something horrific like that, and I get even more mesmerized by that because like with Apple devices, things that I put on my clipboard on my phone or my iPad switch over to my laptop too. So it would be terrifying if someone is using my phone and it goes into my copy paste buffer and goes into the database server.

Does the blocked process report use a lot of CPU?

Brent Ozar: Kelly says, “Is it normal for the blocked process report to use a huge amount of CPU? It’s coming up as who is active as the top consumer and our current blocked process threshold is ten.”

Tara Kizer: I don’t ever use it so I don’t know.

Brent Ozar: Me neither.

Tara Kizer: I use who is active and maybe sometimes if I’m just typing the assist processes on sp_who too, but for blocked processes, I do not use the report.

Brent Ozar: If you want an alternative to it, if you search for – second time Michael J. Swart’s come up on our site – blocked process report viewer. This guy knows a whole lot about the blocked process report. I would go check in with him. He has even written his own blocked process report viewer too. Any time have to troubleshoot any kind of blocking for that long, like I’m either going to log sp_whoisactive to a table or else I’m going to go get a monitoring app that just does it for me.

Tara Kizer: Yes, exactly.

I have this trigger and a multi-threaded application…

Brent Ozar: And then the last question that we’ll take is from Chase. Chase says – why don’t I read these questions before I say this? Chase says – alright Chase, we’re committed, let’s do this – “We have a trigger on a table on a project that I inherited. The application is multithreaded and at times two different threads can pick up the same method for modification.”

Tara Kizer: Stop there, something’s wrong with the trigger. That should not be happening.

Brent Ozar: Oh god. “Are there hints that can be added to the trigger to prevent collisions on inserts and updates so that a primary key violation is not throws or is there a way to ignore the primary key?” Yes, take off the primary keys.

Tara Kizer: There you go. I had a recent client who is a vendor application, no primary key constraints in the [inaudible 0:22:52.1], really old app too and they wanted to use transactional replications like, well, you’re going to have to add primary key constraints. But something’s wrong with the trigger. That should not happen. There is no reason that’s a – I’ve worked on systems that had 5000 to 10,000 bat requests per second, very high insert volume in tables that had triggers. No problems like this, so something’s wrong with your triggers.

Brent Ozar: Something odd with an isolation level, something unusual is going on.

Maybe using identity instead of scope identity, you know, things like that, things can get weird in triggers.

Brent Ozar: That’s a great point. So elaborate on that, if you search for add at identity and add at scope identity, you’re going to see…

Tara Kizer: Not add at scope identity. Scope_identity parenthesis, parenthesis, yes, I don’t know why they changed, it’s weird.

Brent Ozar: It’ll say why – you know what, just plain old identity can get overwritten by triggers.

Richie Rump: Yes, my big question is why were you using triggers for processing like this?

Brent Ozar: He inherited it. He had a nasty grandpa, it was all he got left.

Richie Rump: This smells like a queue, it breathes like a queue, it should be put into a queue in process. It’s – I know he said he inherited it, I’m hoping that’s true, but it’s a lot of things.

Tara Kizer: Chase says, “Yes, so it probably is a queue.”

Brent Ozar: It’s probably not even his real name.

Richie Rump: Sorry Chase, sorry.

How much does an app DBA know about the data?

Brent Ozar: I guess we’ll take one more because this is really interesting and we’ll all have opinions. M.M. says, “In general, how much does an application DBA know about the data content that flows through the servers? Like what’s in various tables and what data is upstream and what data is downstream.”

Tara Kizer: He/she said application DBA, so that’s kind of important to the question. I feel like an application DBA should know all about that. A production DBA may not though. At my last job, even though I was the primary DBA for a system, I never even really learned it in my year and a half there and I barely knew what the business knowledge was for that system because it was just so complex and it had been around for so long, but an application DBA, which I think is also a developer DBA, they need to know that kind of stuff.

Richie Rump: Yes, I’ve never met a DBA that I personally worked with had a title DBA that knew what was going on in the database. It was myself as the app developer/designer that understood what the data was and how it should be formatted and stored and what not.

Brent Ozar: Let’s drill down into that. How many DBAs have you worked with that even understood how SQL Server worked?

Richie Rump: They’re on my hand. Not two, but definitely one.

Tara Kizer: I’ve been fortunate, worked on really great SQL Server DBA teams here. Very, very knowledgeable.

Brent Ozar: Yes, so my feeling on this comes down to this blog post. How many servers can one person manage, and it’s not just the image that’s up on the screen. If you go through and read, I’ve got an explanation of what I’m about to tell you.

When all you do is rack and stack hardware, if you work for Google, Facebook, Amazon, Azure, whatever, you can walk a cart through a data center, anything you see with a red light, you pull it out and replace it with something with a green light. One person can manage thousands of servers.

The instant that you layer on the operating system, Windows, Linux, whatever, one person can manage the patching, configuration, and provisioning of hundreds of servers.

Then when you start to layer in SQL Server, one person can effectively manage 50 to 100 instances of SQL Server, but you’ll notice that as you scale up this ladder, the people up here tend to know less about the stuff down here. They just don’t go into the data center anymore. Then if you layer in high availability and disaster recovery, always on availability groups, and I’m going to say the word replication even though it’s not high availability [crosstalk]…

Tara Kizer: Next you’re going to say recovery mode.

Brent Ozar: Simple recovery mode. One person can manage like 10 to 50 instances of SQL Server and then by the time you get to performance tuning, you have to know database contents, you have to know what the tables mean, what the queries are doing, when it’s safe to change a query. One person can generally manage between one and five applications. Now, that application may be spread across 50 SQL Servers. I’m working on a gig this week where it’s just one app and we’re going to build hundreds of SQL Servers. One person can performance tune all of them because they’ll know the queries and tables, but you could also be on a situation where you have one server with 50 databases on it that are all different apps and you can’t know all of them. You may know the queries that suck but you’re just not going to know the data that flows in and out at all. Every now and then, I’ll also hear someone say, “I have to manage all of that and I have 150 SQL Servers” and I’m like, yes, but when I run sp_Blitz on your servers, they all suck. There are no databases and no back ups, your corruption errors out the wazoo.

Tara Kizer: How did you come up with these numbers? I feel like this is pretty accurate for the jobs I’ve been at.

Brent Ozar: Right? This I wrote in 2015, yes, so it took me years to figure this out. Like how many servers you guys manage and then came up with this over the years.

Richie Rump: He was actually rolling for a character in Dungeons and Dragons.

Tara Kizer: One of my jobs, we had 700 SQL Servers and it was like you know, five full-time SQL Server DBAs and you know, we did have HADR and then I was a performance tuner as well but performance tuning, it really was one – I’d say about three systems that I was really specialized in but we had 700 servers.

Brent Ozar: And it wasn’t that you can’t do it. I mean, you can jump over to another server and start doing performance tuning but every hour that you’re on that, the other ones are falling apart so. Alright, well thanks everybody for coming and hanging out with us this week at Office Hours. We will see you guys next week! Adios.

Tara Kizer: Bye.

Adaptive Joins And SARGability

Last Updated May 12, 2017

There’s a famous saying

Non-SARGable predicates don’t get missing index requests.

And that’s true! But can they also stifle my favorite thing to happen to SQL Server since, well, last week?

You betcha!©

One Sided

I’m going to paste in some queries, each with something non-SARGable.

There are two tables involved: Users and Posts. Only one table will have a non-SARGable predicate in each query.

SELECT u.Id, p.Score 
FROM dbo.Users_cx AS u
JOIN dbo.Posts AS p
ON LTRIM(RTRIM(p.OwnerUserId)) = u.Id --I'm icky and gross to Posts
WHERE u.LastAccessDate >= '20160101';

SELECT u.Id, p.Score 
FROM dbo.Users_cx AS u
JOIN dbo.Posts AS p
ON p.OwnerUserId = LTRIM(RTRIM(u.Id)) --I'm a loser to Users
WHERE u.LastAccessDate >= '20160101';

SELECT u.Id, p.Score 
FROM dbo.Users_cx AS u
JOIN dbo.Posts AS p
ON p.OwnerUserId = u.Id
WHERE YEAR(p.LastActivityDate) >= 2016; --I'm antiPostial

SELECT u.Id, p.Score 
FROM dbo.Users_cx AS u
JOIN dbo.Posts AS p
ON p.OwnerUserId = u.Id
WHERE YEAR(u.LastAccessDate) >= 2016; --Sacre bleUser!

SELECT u.Id, p.Score

FROM dbo.Users_cx AS u

JOIN dbo.Posts AS p

ON LTRIM(RTRIM(p.OwnerUserId)) = u.Id --I'm icky and gross to Posts

WHERE u.LastAccessDate >= '20160101';

SELECT u.Id, p.Score

FROM dbo.Users_cx AS u

JOIN dbo.Posts AS p

ON p.OwnerUserId = LTRIM(RTRIM(u.Id)) --I'm a loser to Users

WHERE u.LastAccessDate >= '20160101';

SELECT u.Id, p.Score

FROM dbo.Users_cx AS u

JOIN dbo.Posts AS p

ON p.OwnerUserId = u.Id

WHERE YEAR(p.LastActivityDate) >= 2016; --I'm antiPostial

SELECT u.Id, p.Score

FROM dbo.Users_cx AS u

JOIN dbo.Posts AS p

ON p.OwnerUserId = u.Id

WHERE YEAR(u.LastAccessDate) >= 2016; --Sacre bleUser!

What do you think will happen?

Will all of them use Adaptive Joins?
Will some of them use Adaptive Joins?
Will none of them use Adaptive Joins?

Will you stop asking me these questions to fill space before the picture?

I told you something would happen

The queries with non-SARGable predicates on the Users table used Adaptive Joins.

The queries with non-SARGable predicates on the Posts table did not.

Now, there is an Extended Events… er… event to track this, called adaptive_join_skipped, however it didn’t seem to log any information for the queries that didn’t get Adaptive Joins.

Bummer! But, if I had to wager a guess, it would be that this happens because there’s no alternative Index Seek plan for the Posts table with those predicates. Their non-SARGableness takes that choice away from the optimizer, and so Adaptive Joins aren’t a choice. The Users table is going to get scanned either way — that’s the nature of ColumnStore indexes, so it can withstand the misery of non-SARGable predicates in this case and use the Adaptive Join.

Thanks for reading!

Brent says: when you see stuff like this, it’s tempting to slather columnstore indexes all over your tables, including OLTP ones. Before you do that, read Niko’s post on using UPDATEs with columnstore, and watch his GroupBy session, Worst Practices and Lesser-Known Limitations of Columnstore Indexes. Yes, you can still get excited about adaptive joins, but they’re not quite a solution to bad OLTP queries – yet.

Adaptive Joins And Local Variables

Last Updated May 12, 2017

With new features

I really love kicking the tires to see how they work with existing features, and if they fix existing performance troubleshooting scenarios.

One issue that I see frequently is with local variables. I’m not going to get into Cardinality Estimator math here, we’re just going to look at Adaptive Join plan choice for one scenario.

Understanding local variables

When you use local variables, SQL Server doesn’t look at statistics histograms to come up with cardinality estimates. It uses some magical math based on rows and density.

This magical math is complicated by multiple predicates and ranges.

It’s further complicated by Adaptive Joins. Sort of.

A simple example

The easiest way to look at this is to compare Adaptive Joins with literal values to the same ones using local variables. The results are a little… complicated.

Here are three queries with three literal values. In my copy of the Super User database (the largest Stack Overflow sub-site), I’ve made copies of all the tables and added Clustered ColumnStore indexes to them. That’s the only way to get Adaptive Joins at this point — Column Store has to be involved somewhere along the line.

The last day of data in this dump is from December 11. When I query the data, I’m looking at the last 11 days of data, the last day of data, and then a day where there isn’t any data.

SELECT u.Id, p.Score 
  FROM dbo.Users_cx AS u
  JOIN dbo.Posts AS p
  ON  p.OwnerUserId = u.Id
WHERE u.LastAccessDate >= '2016-12-01'
GO

SELECT u.Id, p.Score 
  FROM dbo.Users_cx AS u
  JOIN dbo.Posts AS p
  ON  p.OwnerUserId = u.Id
WHERE u.LastAccessDate >= '2016-12-11'
GO 

SELECT u.Id, p.Score 
  FROM dbo.Users_cx AS u
  JOIN dbo.Posts AS p
  ON  p.OwnerUserId = u.Id
WHERE u.LastAccessDate >= '2016-12-12'
GO

SELECT u.Id, p.Score

FROM dbo.Users_cx AS u

JOIN dbo.Posts AS p

ON p.OwnerUserId = u.Id

WHERE u.LastAccessDate >= '2016-12-01'

SELECT u.Id, p.Score

FROM dbo.Users_cx AS u

JOIN dbo.Posts AS p

ON p.OwnerUserId = u.Id

WHERE u.LastAccessDate >= '2016-12-11'

SELECT u.Id, p.Score

FROM dbo.Users_cx AS u

JOIN dbo.Posts AS p

ON p.OwnerUserId = u.Id

WHERE u.LastAccessDate >= '2016-12-12'

I get Adaptive Join plans back for all of these, with accurate estimates. I’m looking at the Live Query Statistics for all these so you can see the row counts on the Users table. You can see this stuff just fine in Actual and Estimated plans, too.

The first two queries that actually return rows estimate that they’ll use a Hash Join, and both choose the Index Scan of the Posts table branch of the plan to execute. The last query chooses the Index Seek branch, and doesn’t end up needing to execute either branch because no rows come out of the Users table. It also estimates that it will use a Nested Loops Join rather than a Hash Join because of the lower number of rows.

Think Locally, Act Mediocrely

If I flip the query around to use local variables, some things change. I’m using all the same dates, here.

DECLARE @MyLeftFoot DATETIME = '2016-12-01'

SELECT u.Id, p.Score 
  FROM dbo.Users_cx AS u
  JOIN dbo.Posts AS p
  ON  p.OwnerUserId = u.Id
WHERE u.LastAccessDate >= @MyLeftFoot
GO 

DECLARE @MyLeftFoot DATETIME = '2016-12-11'

SELECT u.Id, p.Score 
  FROM dbo.Users_cx AS u
  JOIN dbo.Posts AS p
  ON  p.OwnerUserId = u.Id
WHERE u.LastAccessDate >= @MyLeftFoot
GO 

DECLARE @MyLeftFoot DATETIME = '2016-12-12'

SELECT u.Id, p.Score 
  FROM dbo.Users_cx AS u
  JOIN dbo.Posts AS p
  ON  p.OwnerUserId = u.Id
WHERE u.LastAccessDate >= @MyLeftFoot
GO

DECLARE @MyLeftFoot DATETIME = '2016-12-01'

SELECT u.Id, p.Score

FROM dbo.Users_cx AS u

JOIN dbo.Posts AS p

ON p.OwnerUserId = u.Id

WHERE u.LastAccessDate >= @MyLeftFoot

DECLARE @MyLeftFoot DATETIME = '2016-12-11'

SELECT u.Id, p.Score

FROM dbo.Users_cx AS u

JOIN dbo.Posts AS p

ON p.OwnerUserId = u.Id

WHERE u.LastAccessDate >= @MyLeftFoot

DECLARE @MyLeftFoot DATETIME = '2016-12-12'

SELECT u.Id, p.Score

FROM dbo.Users_cx AS u

JOIN dbo.Posts AS p

ON p.OwnerUserId = u.Id

WHERE u.LastAccessDate >= @MyLeftFoot

A bunch of stuff changed, mainly with estimates from the Users table. Which makes sense. That’s where u.LastAccessDate is.

Most importantly: they’re all the same now! All three estimated 143,495 rows would match from the Users. And this is where things get interesting.

The second query chooses a different index, but with the same estimated join type (Hash), and the third query bails on the Nested Loops Join it estimated when it gets no rows back.

They both have unused memory grant warnings — only the first query asked for and used its entire grant.

Lesson: Adaptive Plans can still get wonky memory grants with local variables.

But what actually happened?

Exactly what was supposed to happen. The secret is in the thresholds.

In the local variable plan, the threshold for Join choice is MUCH higher than in the plan for the literal. When the actual rows (1202) doesn’t hit that threshold, the Join type switches to Nested Loops, and likely abandons the memory grant that it asked for when it estimated a Hash Join.

The plan with the literal values has a threshold of 1116 rows for Join choices. We hit 1202 rows, so we get the Hash Join plan.

Yeah that’s what adaptive means, knucklehead.

Is that better or worse?

Well, the real lesson here is that local variables still aren’t a great choice. The Adaptive Join process figures that out now, at least.

I’m not sure if it gives the memory grant back immediately when it figures out it doesn’t need it. That’ll take some digging. Or someone from Microsoft to comment.

Thanks for reading!

Brent says: In the last year, Microsoft has been generous with backporting memory grant info all the way to 2012 via cumulative updates. However, it’s still a relatively new topic for a lot of performance tuners. To learn more about that, check out our past post on An Introduction to Query Memory, and check out sp_BlitzCache’s @SortOrder = ‘memory grant’.

Anatomy Of An Adaptive Join

Last Updated May 12, 2017

I don’t like it unless it’s brand new

When new features drop, not everyone has time to jump on top of them and start looking at stuff. That’s what consultants with nothing better to do are for.

I’ve been excited about this feature since talking to The Honorable Joseph Q. Sack, Esq. about it at PASS last October. My pupils dilated like I just found the bottom of a bottle of Laphroaig 18.

Let’s look at the Adaptive Join process in a way that most of you will eventually see it: in a query plan. Query Plan. Should I capitalize that? The jury is out.

The Operator

This is what the operator itself looks like:

Greased Lightning

And as of CTP2 of 2017, this is the information available in the Adaptive Join operator itself.

Some points of interest:

Actual Join Type: doesn’t tell you whether it chose Hash or Nested Loops
Estimated Join Type: Probably does
Adaptive Threshold Rows: If the number of rows crosses this boundary, Join choice will change. Over will be Hash, under will be Nested Loops.

The rest is fairly self-explanatory and is the usual stuff in query plans.

The Plan

Each Adaptive Join plan will have a branch for each path the optimizer can choose.

In our case, the Index Scan/Hash Join plan is the first path, and the Index Seek/Nested Loops Join is the second path.

Inside the plan, there are a couple visual cues that let you know which path the query took.

The width of the lines: the wider line likely had data flow through it
The number of executions line in the tool tip for the index access: So far as I’ve seen, the path the query chooses will have > 0, and the path the query didn’t choose will have 0 here.

Misc

In other posts I’ve mentioned stuff! And things. I’m going to bring it all together here.

You need CTP2 of SQL Server 2017 to use Adaptive Joins (for now)
At least one table has to use a ColumnStore index (for now)
Database compatibility has to be 140 (for now)
You need to be using SSMS 2017 to see the Adaptive Join operator in graphical query plans (you can see it in SET STATISTICS XML ON otherwise, it’s text only)
Right now, only Nested Loops and Hash Join are supported (my best guess is that Merge Join was left out because of the potential cost of having to inject a Sort operator into the plan)

Whew. Hey. That’s it. Go home.

Thanks for reading!

Brent says: this has interesting implications for index-tracking DMVs like sys.dm_db_index_usage_stats. In the past, I’ve explained that DMV as “counting the number of times a plan with that index in it has executed,” but I’m going to have to dig deeper now that it shows the same index twice here with two different operations (a seek and a scan.)

The 2017 Adaptive Join Optimization Eats Bad TSQL For Breakfast

Last Updated May 12, 2017

Cheeky

If you thought that title sounded familiar, you sure were right.

That’s called a classical reference.

So here we are, three whole years later, and yet another improvement to the engine promises to fix performance issues forever and ever.

While this isn’t exactly an entirely new cardinality estimator, it’s an entirely new branch in the cardinality estimation process.

Adaptive Joins

Or as I’m going to call them, the Mr. Destiny Join.

You see, in every plan, you see both possible paths the optimizer could have chosen. Right now it’s only limited to two choices, Nested Loops and Hash Joins.

Just guessing that Merge Joins weren’t added because there would have been additional considerations around the potential cost of a Sort operation to get the data in order.

Fun, right? Ever see a Sort in a query plan where you didn’t write an ORDER BY? Sometimes SQL will act like your mom and say “I’ll just straighten up your room for you” but then she finds the box under your bed and, well, anyway.

I’m 36 and I’m finally not grounded.

Why this is awesome

After the second Council Of ~~Trent~~ Brent decided that SARGable is a word, a formal edict was issued that all all queries must be SARGable in order to perform well.

With Adaptive Joins, the optimizer is way less “that’s not an iceberg” about the way it chooses query plans. It’ll go ahead and run some tests to make sure that it is, in fact, not an iceberg, before steering plan choices in a particular direction. This can be particularly helpful for non-SARGable queries, where the optimizer would usually way overestimate rows, and ask for all sorts of memory and CPU and other knick-knacks and whirlygigs to compensate.

I’m going to be kicking the wheels on Adaptive Joins and blogging about it as I go. Stay tuned!

Thanks for reading!

Brent says: when I first heard about Adaptive Joins, I thought, “This is amazing!” but I still had no idea about how complex it would get. Reading Erik’s upcoming posts, I’m even more excited about the potential this holds for down the road. Granted, it’s only for columnstore indexes right now, but still, this is gonna make 2017 a heck of a release for performance tuners.

First Responder Kit Re-Release: The Hubris Of Titles

Last Updated May 2, 2017

First Responder Kit, SQL Server

The Gods of the Copy Book Headings Frowned Upon Me

So yesterday, I pushed out a new FRK.

You could say I FRKed up, and you’d be right.

See, before I push stuff to master, I do a round of testing in our AWS lab to find version-specific bugs and make sure everything is at least compiling.

If the round of testing passes, I bump version numbers and merge the changes into the master branch.

Yesterday, I found a bug in sp_BlitzCache after bucketing some checks into version-specific sections. I did this to avoid unnecessary XML processing.

For example, versions of SQL Server prior to 2012 don’t have a NonParallelPlanReason in there, so I figured I’d save some CPU cycles by not looking for it. Trace Flags are also only in newer versions, so I added a version check for them too.

The error was that later on in sp_BlitzCache, we look at the #trace_flags table to provide information about which trace flags are enabled, and that select needed a version check, too. So I added that, bumped version numbers, and then…

Well, that didn’t make it to the master branch.

Until this morning.

After causing some grief for a user.

Sorry, friend. Have an upvote for your trouble.

Eggy

The issue has been fixed, and everything has been updated and refreshed on our end. If you use older versions of SQL Server, and sp_BlitzCache, you’ll definitely want to download the update.

I’m looking at better ways to do the last round of testing and make sure this doesn’t happen again. If anyone out there has any Hot Git Tricks (aside from “pay attention, dummy”), leave a comment.

My Walk Of Shame continues.

First Responder Kit Release: Ça plane pour moi

Last Updated May 1, 2017

First Responder Kit, SQL Server

Ah, May. Spring has sprung, and young people’s minds turn towards seasonal allergies.

Special thanks to @digitalohm for the Differential Restore code and general sp_DatabaseRestore spiffiness. May you live to an age that can only be described in Unix.

You can download the updated FirstResponderKit.zip here.

sp_Blitz Improvements

#857 Do you have any idea how much stuff is still in the default trace? Billions of things. We go spelunking in there for people running DBCC commands that are weird, then we yell at you about them.

sp_BlitzCache Improvements

#864 Having standards is important. Someday we’ll have some, until then we’ll borrow yours. This puts the Cost column in the same place across all ExpertMode outputs. Thanks @douglane!
#860 You don’t care about stuff you don’t care about, and neither do we. This moves checks into version aware buckets so we don’t go looking for problems we can’t find on older versions.
#839 Big plans! Well, they’re good for people, but not so good for SQL Server. If you have > 128 levels of XML nesting, they don’t show up in a way we can analyze them. Sorry, Tennessee.
#842 Getting stored procedure costs could be awfully slow. This fixes that.

sp_BlitzFirst Improvements

Nothing this go around

sp_BlitzIndex Improvements

Nothing this go around

sp_BlitzWho Improvements

#853 Sometimes SQL isn’t good at math when it gets big numbers. This fixes an issue with that. Thanks to @wScottw for the heads up.
#845 We now use READ UNCOMMITTED to do our dirty work. Why? Because it’s dirty.
#848 We added a table variable to hold intermediate results from sysprocesses. This makes things faster on busy servers.

sp_DatabaseRestore Improvements

#814 We now do differential restores, too. What’s the difference? Run it and see! Thanks to @digitalohm!

You can download the updated FirstResponderKit.zip here.

[Video] Office Hours 2017/04/26 (With Transcriptions)

Last Updated April 30, 2017

Brent Ozar Unlimited Team

SQL Server, Videos

This week, Erik – all by his lonely self – talks about troubleshooting remote queries, his thoughts on SQL Server 2017, licensing, index rebuilds, index fragmentation, why you shouldn’t restart SQL server every day, and much more!

Here’s the video on YouTube:

Office Hours Webcast - 2017/04/26

Watch this video on YouTube

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017/04/26

Why doesn’t SQL Server use indexes on linked server queries?

First question here is from Michael, “can you think of any reason indexes would be ignored or not used by a query being performed across a linked server? Server A is SQL Server with Native SQL linked to server B, server B…” Hmm, think of a reason why, well, I can think of lots of reasons why. Linked server queries are the pits, they are the absolute worst. It could – well no because it’s 2014 to 2012. So depending on how patched up you are on 2012; 2012 sp1 had an important fix in it where you no longer needed crazy elevated permissions to use statistics and other objects on a linked server, so prior to that you had to have SA or SA-ish superpowers on your server to get that.

Generally, though it’s just really tough to troubleshoot remote queries because SQL can make all sorts of weird decisions about where it’s going to do the processing, whether it’s locally or drag data across. So, you know, sometimes it just makes them really difficult to troubleshoot weird performance ticks like that. There’s a really good talk by Conor Cunningham, that’s Conor with one N, at SQLBits about distributed queries. It’s not going to answer your question particularly but it’s going to give you a much better idea about just how complicated distributed query processing like linked servers and stuff is. So once again that’s Conor Cunningham at SQLBits – if you just search the domain for Conor Cunningham and distributed query you’ll find it a lot easier than if you try to use the built-in search on the website.

Any thoughts on SQL Server 2017?

Let’s see here… Philip asks if I have any thoughts on SQL Server 2017. Well, of course, I have lots of thoughts on it, it’s only got CTP2, but man, they are just adding so much kickass stuff already. Like I can’t – like I don’t know what they’re going to put in CTP3 that’s going to like outdo what’s in CTP2. The adaptive join stuff, even though it’s just at the start for batch mode, executions, only for column store stuff, is incredible, it’s absolutely incredible what they’re doing. The interleaved execution, so they’re going to start making much better estimates from multi-statement table value functions – not necessarily for table variables themselves but for multi-statement table value functions they’re going to go out and do this thing where instead of just guessing the one crappy row and giving you that one awful estimate, they’re going to go and try to execute the function and get a row estimate from there and then – they’re going stop cardinality estimation, do that, then resume cardinality estimation with the row count from the function.

So that’s pretty amazing, I don’t know how many of you guys out there are using multi-statement table value functions but they just have so many damn problems because of table variables and because of other, you know, cardinality estimate bugaboos that, like, just the fact that they’re stepping in and doing this is pretty incredible. We were talking about before, the bucketed wait stats coming to query store where now you’d be able to, you know, sort of by troubleshooting scenario, whether it’s CPU or parallelism or blocking or locking or whatever else, you’ll be able to look at query store, if you have it enabled and you’ll be able to look at wait stats associated with executions of the queries, you get wait stat information. I’ve been messing around with that a little bit lately.

God, what else? Linux… Just kidding, I don’t care.

Man, every time somebody tries talking about Linux I fall asleep; but there’s a whole bunch of new cool DMVs out there that are going to start cropping up. They’re going to come in real handy for stuff like Blitz and BlitzFirst, and one that’s going to be awesome for BlitzIndex, where it looks like they’re finally starting to record query plan hash, and I believe last SQL handle for missing index requests.

So we’re finally, at long last, “like let my people go.” we’re finally going to be able to tie missing index requests to specific queries, which is something you just can’t do now unless you go mess with the plan cache and do all sorts of awful stuff in the plan cache and then tie it back to the missing index DMVs.

It’s just – it’s a mess trying to do that because you have to go and like do all this special XML processing of the missing index request to get the column names and the grouping and the ordering and then join all that stuff back to the missing index DMVs and that never really belonged in BlitzCache or BlitzIndex. Because for BlitzIndex, you know, we would have to add a whole bunch of XML processing overhead and hit the plan cache and look for stuff. And in BlitzCache, we would have to go and do a whole separate extra bunch of XML processing and then hit the missing index DMVs to try and get that. And this one DMV solves that entire problem and bridges that gap. It’s just up to you people to finally upgrade your servers.

So there’s all sorts of stuff in 2017 I’m excited about, like we talked about a little bit in the preamble, in the lead up while you were sitting here watching me do nothing and goof around with my headset. There’s, also coming, a special new switch, a query hint I’m assuming, that will allow us to force queries to run parallel. If you were in GroupBy he talked a little bit about that, Brent showed the web page on his screen, but I’m super excited about that as well.

Will a SQL 2014 license work with SQL 2008?

Let’s see, let’s find some questions, jeez – Doug has a rather awkward licensing question. He wants to know if a 2014 license is backward compatible for 2008. You know I really try to avoid answering life or death licensing questions like that. Talk to whoever you bought licensing from, whether it’s a third party vendor or whether it’s directly from Microsoft, talk to your sales rep and see what they have to say. I really hate trying to make sense of that kind of stuff, especially because I don’t know what deals you might have had in place, what kind of grandfathering or SPLA or what else you have going on. Talk to whoever you bought the licenses from and get clarification there.

What are your thoughts on maintenance plans?

Let’s see here – another Doug, a different Doug, I’m going to call you Different Doug, asks, “what are your thoughts on maintenance plan tasks such as rebuild English…” Because I can speak English so well… “Rebuild index and check database integrity?” Well you know, I think less of index rebuild tasks than I do of database integrity tasks. You know, DBCC CHECKDB, you’ve got to run that, you’ve got to know if your databases are corrupt, because corruption just gets worse over time. It’s very rare to have corruption hit one place and stay there, you know. You can have like some kinds of corruption that might be localized but generally, if a disk starts going bad, that rot tends to spread and get worse so I wouldn’t want to have that keep going over time.

Index rebuilds, you know, as infrequently as possible, I’m not a terribly big fan of those. There’s a great comment on one of our blog posts from Michael J Swart where he says, “index rebuilds are just a tremendously expensive way of updating statistics” and I couldn’t agree with a statement more. I would much rather see everyone updating statistics regularly and rigorously than doing a whole bunch of index rebuild and reorg stuff that just very rarely solves problems. And, more often than not, figuring out which indexes are fragmented and then defragmenting them takes more time and resources than your queries would ever take reading fragmented indexes. Now that’s not the most scientific statement but generally what I’ve found, especially working in environments where I was managing hundreds of terabytes of data. The amount of time that I would spend running these maintenance tasks would never ever add up to the amount of time that queries spent running.

Who cares if an index is kind of fragmented. A fragmented index is a lot more like having a, kind of, a dirty room than it is like having some sort of performance ending apocalyptic event like a comet smashing into your server happen. So I tend to leave the index rebuild and reorg stuff way, way, way off. I say reorg at 50%, rebuild at 80% and really, even Microsoft’s other recommendation, around 1000 pages, that’s 8MB. If you have trouble reading 8MB off disk and into memory, your server has problems that index rebuilds aren’t solving. For that, especially for older scripts, I’ll set page count level way up to 5000 or something, so I’m only dealing with like real big honking tables, because those small tables may not even actually get defragmented anyway. That’s kind of what I thought about that.

How upvoted will your question be by end of day?

“Any bets on how upvoted the question you discussed in your blog post will be by the end of the day?” Which, dude, which one? I’ve got to go look now. Is there a blog post today on it? I’ll be honest, I haven’t looked at the blog today. I haven’t gotten any emails about comments so I haven’t had a reason to, and sometimes Brent just does funny things with schedules. Oh okay, that one – I don’t know, hopefully, hopefully fairly upvoted because that’s a good question. I thought that I would have a really good answer to that but I kind of got beat to the punch when I started looking at all the other stuff going on and these things, then I got sucked into the start up expression predicate thing and well, I don’t know. I hope that question gets upvoted and gets some attention because it’s a good question, it’s an honest question.

I also hope that Joe’s answer gets him upvoted because he gave a pretty good detailed answer on what’s going on in that query. So props to both of them because, you know, there’s a lot of potato-y questions of Stack Exchange that – they’re tough, tough to read some of them sometimes. You’re like what are you doing man, stop that.

How interested are you in Python in SQL Server?

Let’s see here, Clark asks, “how interested are you in seeing Python within SQL Server?” Well if it’s the same as R where it’s just a data analysis and like, you know, a data scientist type add-on, I’m not that excited. What would really get me going is if, you know, like in, say Postgres, or I believe in Oracle where they have – where you can write procedural code or you can use other languages within stored procedures and functions. I would be super interested in that because there’s some stuff that’s just way better suited to that, and like, you know, we have CLR, and CLR’s okay for stuff, you have to know C#, I don’t want to learn C#, C#’s a pain in my butt.

C# is also, you know, you have to compile a DLL and add the DLL, there’s a lot of binary involved, stuff that’s scary on frozen caveman DBA, I don’t want to do all that. If I could – I’m okay with Python from a previous job where I used to do some like text file manipulation with it as part of an ETL process. So if I could like, you know, actually write stuff in Python to do, like say regex, which still isn’t in SQL Server. See now this is one of those things, you can do regex in CLR, you can implement that and that’s usable and it’s probably doable, but you can do regex in CLR and that’s pretty good. And you can do some okay stuff with PATINDEX and CHARINDEX and you know, Microsoft, for all the cool stuff they’re adding with the STRING_SPLIT function in 2016 and STRING_AGG coming to 2017 where you can skip all that sloppy FOR XML PATH stuff when you want to make a CSV list of things.

We still don’t have legit built in regex in SQL Server. It would be awesome to have, but if I could, you know, port that out to writing a function or writing a stored procedure in Python to do that, I would be much more excited than I would be about the data scientist stuff. That being said, I’m all on board with Microsoft adding that for people who do it. I think it’s a great feature for people who are data scientists or aspiring data scientists or whatever else, but it’s just not for me as a data science feature. If I could implement that in some other way, I’d be thrilled.

If I could use Python to parse query plan XML rather than writing XQuery and T-SQL, BlitzCache would run in like a second. It would be the fastest thing ever because I could skip so much garbage, I could skip so much garbage. Just as a proof of concept, a while back, I just messed around with importing one fairly large bit of query plan XML in Python and doing some parsing and it was just like done, and it called everything out. There was no, like, repeated cross supply and all the other stuff that – the cross supply with the nodes and the exists, blah, blah… There was none of that repeated overhead looking for stuff, it was all very fast. Anyway, been talking for long enough.

Small talk

J.H. asks, “having a stored proc seems to be hanging in suspended with two open transactions, it’s an update with a SELECT FROM and it’s blocking itself and not moving at all with the CX packet wait type, any idea what to do?” Boy, I would need a whole lot more on that J.H. I’m sorry, that’s just not much to go on for here. I would probably just want to go ahead and post that with a whole bunch of information on dba.stachexchange.com, I’m not being lazy on you, there’s just not enough for me to really tell you.

Should I restart SQL Server daily?

Let’s see here, Ujwal, I hope I said that right, says, “for starting SQL Server services, clear the buffer pool and buffer cache.” Yes it does. “Is it recommended to have a daily restart?” No, absolutely do not do that. “Any benefit?” Only if you have – let’s just say no. I was going to give you an outside case, but I don’t want you to say oh I have that outside case and start doing it. No, don’t restart SQL Server every day, it’s not good, it clears out a bunch of stuff you probably want cached. SQL Server goes through great pains to cache things because it’s faster to do that.

What are alternatives to linked server queries?

Doug asks, “you said linked server queries were bad, can you suggest an alternative to using linked server queries to provide a single data source to SSRS?” Tableau, just get Tableau, get a Tableau server, move data once, I don’t know. I tend to dislike linked server queries because when I’m writing a query that has to perform well that’s going to use cardinality estimation, I’m going to want to, you know, go out and do some like role emanation on the other server, I can’t count on that. What I would much rather do is write the simplest query possible to move data over locally and then have my query hit local data rather than try to do some big crazy expensive, you know, multi-join, lots of WHERE clause stuff on our remote server. So get data local and then query it.

Should I separate SSAS/RS/IS from the engine?

M.M. asks – so the gist of the question is they have physical servers and they’re moving them up to 2014. The main production server uses SSAS and SSRS, we’ll have access to one new VM, would you suggest… Generally, when I have outside components for SQL Server I want them separated from the core engine stuff. SSAS and SSRS and SSIS as well can be pretty bad data hogs, or memory hogs rather and just resource hogs in general.

Whenever I’ve had to set up environments that needed those I always wanted to stick them on their own server because generally, you can get away with sticking them on a slightly smaller scale server. Like say it could be 2014 standard edition, you could stick, you know, not as much ram and, you know, fewer processors on there and get away with it. I just don’t want that stuff walking over my production workload unless I’m on such a big honking server that everything can spread out and do whatever it wants. I’m just not a big fan of having stuff sort of collapse in on itself there.

Should I upgrade in place or build a new server?

Let’s see, going on down… Nester asks, “what do you recommend for version upgrades, in place or fresh installations?” Like we say once a week at least, fresh installations use mirroring or log shipping, or if you have the wherewithal availability groups to do your cutover to minimize downtime. In-place upgrades have a lot of issues; if anything goes wrong with the install, you don’t have a backup plan. If you get there and all of a sudden performance tanks, or like, you know, you bring people in to do a smoke test; Kaboom, that’s it, you don’t have a backup plan to go back to a previous version or even back to a previous hardware.

John asks, “what do you think of Microsoft Dynamics running on Azure Cloud?” I don’t, sorry man, not the biggest fan of Dynamics, also not the biggest fan of Azure, so good luck, good luck in your endeavors.

Do I need to rebuild indexes if I use SSDs?

Justin asks, “if running on SSDs, is there any value in rebuilding indexes?” Well, Justin, like I always say, if I’m going to say that there’s value in rebuilding indexes, I want very specific metrics on what value there is but if you’re running on SSDs there’s much less. And also if you just spend some money on memory you can always have your stuff in memory and, you know, where index fragmentation doesn’t matter? In memory – you know why? Because memory is RAM and RAM stands for random access memory and index fragmentation is – the problem usually is to find us having to randomly access stuff on disk, you hit a whole bunch of random I/O jumping around from one spot to the other. So just get as much memory as you can in a server, there you go, book, done, you don’t have to worry about the next fragmentation ever again.

How do you remember everything?

Wes asks, “how do you guys always have people’s comments from old blog posts, who wrote what and where and all this information?” We have nothing else to think about, this is it. Like back when I had a real job I would have to have like bookmarks and take screen caps and write things down, now that I have nothing else to think about I just remember this stuff now. It’s nice, my buffer cache is now just full of this stuff rather than having to think of stiff, like, what am I going to do at work today, how do I fix this problem… I just remember other things now.

How should I refresh a subset of data to dev?

M.M. asks, “is there a good resource for scripts to set up a job to store a subset of fresh prog data on the dev server, we don’t have enough room on dev to store full copies, we’ll likely be using…” Jeez, for that stuff I usually go third party. So not really built into SQL, at least not in a nontrivial way. Backup tools like Quest LiteSpeed or Redgate SQL Backup, you can do object level restores from them. So if you use them and you take your regular backups with them, you could restore a subset of that backup back to a dev server. I don’t know if you have to do any data cleansing or anything like that in between, but generally, for object level restore stuff like that I go with third party tools. I don’t want to mess with anything else.

Let’s see here, recent blog posts – Julius asks – yeah so if you type in Brent Ozar in Chrome you can just search the domain, look for a post called – you know what? I’ll just find it for you. Give me a minute to type, I’ll find it for you. It’s a post that Doug wrote, which I think has possibly the best flow chart ever created about index fragmentation. As far as in-depth detailed stuff about why, no, not really, just because it’s very situational and it requires a lot of demo work, so it’s not really great for a blog post. Brent is working on a full bore presentation on index fragmentation, so there’s always that.

Ernie, copy and pasting the same question in over and over again, my friend, is not going to get me to answer it. Please, please, please I’m begging you, post it on dba.stackexchange.com. You will have at your disposal, for free, many, many people who really want to answer your question and get those little green notifications; including a bunch of people who work for Microsoft that can tell about always on availability groups that you’ll love to hear. All sorts of people who write long extensive detailed answers to your question – post it over there, don’t be afraid. If you email me the link to your question to help@brentozar.com I will go and I will upvote your question so you will get a special green plus five from me. Please post it over there. I’m not trying to hurt your feelings it’s just too much for this little thing, I’ve got to get stuff in that I can read and then come up with – I’m by myself so it’s even harder.

Should I put SSAS/IS/RS on their own VMs, or one?

M.M. has a follow-up, “for breaking out SSAS and all those things, which one would you suggest moving to the smaller VM, or more than one?” If you’re using all of them, like hardcore, you might want to have them all on their own VMs. If you’re using all three, sort of lightly or for different tasks and what not, you might be able to get away with moving them all to one machine as long as that one machine has a decent amount of resources. So, you know, you also have to take into account what you’re doing with it. For SSIS, if you’re moving 512GB of data or something ridiculous you’re going to need RAM to back that up. So just, you know, kind of be reasonable with what you’re asking each machine to do.

What I see a lot when I’m working with clients who have these outside resources all sitting on one box is there won’t be a lot of memory on there to begin with. And then what inevitably ends up happening is they’ll say, we hit this random slowdown every time we run a report, and so we’ll look at stuff and like, you know, we’ll look at stuff in memory and whenever – we were looking at like total server memory and target server memory and stuff like that. Whenever there’s this report being run, total server memory will bump way down, target server memory will budge down a little bit and SQL will all of a sudden have to start reading pages back into memory from disk, so page I/O latch waits will go screwing way up. So all sorts of fun stuff in there.

How should I merge indexes together?

Javier asks, “how to determine which indexes are candidates to merge using the include option?” Javier, where are you, my man? My friend, we have sp_BliztIndex, if you run that in mode equals four with your database, we will tell you duplicate and borderline duplicate indexes. For your specific use case, so they’re both going to be helpful here. So for duplicate indexes, those are indexes that have all of the same key columns but maybe different included columns. For borderline duplicate indexes, those are going to be indexes that have the same leading key column but then different key columns and then maybe even different includes after that. So if you run sp_BlitzIndex in mode equals four for your database you’ll get a list of all of the indexes you could ever want to possibly condense in that one place.

Just be careful, just be careful because you can end up with some pretty big indexes if you just go and consolidate everything. So, you know, you have to kind of figure out if you’re better off having a couple of smaller indexes or one big superset index. Usually, the one big superset index is a good choice but be careful, I would urge caution when doing that stuff, especially if you have to mess with key column order… sp_BlitzIndex mode four.

We’re just about at the 45 mark, me being by myself and having to eat lunch before I start up a call at one, I’m going to take off a little bit early, we’re out of questions anyway. Everyone have a great week, actually couple of weeks because I’m going to be in France, like I said, I’m going to be on vacation for the first time ever, so I will see you when I’m back, adios, good luck.

Status Update On Query Plan Hash Bug

Last Updated April 21, 2017

Execution Plans

Not long ago I posted about a bug

BlitzCache because Query Plan Hash in the XML was being truncated.

Well, yesterday I got an email that Adam Machanic has never gotten (I kid, I kid).

CLOSED AS FIXED

Well, sort of.

It’ll be fixed in vNext, and a CU for 2016. Anything before that, and you’re on your own. And it’ll only kick in for new plans, so any old plans with busted Query Plan Hashes will continue to have them.

We have implemented zero-extending “QueryHash” and “QueryPlanHash” attributes in the XML plan to match the output of DMVs, you will see it in the next SQL vNext CTP and in the upcoming Cumulative Update for SQL Server 2016 SP1 (CU3), scheduled for release at the end of May 2017. We have not, however, modified the XML schema due to backwards compatibility issues, the attributes in question will remain strings in there.

Please keep in mind that these attributes will be generated with leading zeros only for new plans. If you have any plans stored in Query Data Store – those will remain unchanged until purged and regenerated.

If you didn’t have a reason to upgrade SQL Server before, this is clearly it.

Thanks for reading, and thanks to everyone who voted!

Look Ma, Adaptive Joins

Last Updated May 12, 2017

Execution Plans, SQL Server 2017

This probably won’t seem like a big deal soon

But I just got the optimizer to pick an Adaptive Join! It took a few tries to figure out what would cause some guesswork to happen, but here it is.

And here’s the tool tip info!

This is so cool! Now I can start picking apart the XML to add stuff to sp_BlitzCache.

In the XML, we have this information.

Papers, please

There also appears to be a whole XML path dedicated to the AdaptiveJoin, but I’m still working out what all goes on in there.

Remember that you need at least CTP2 of SQL Server 2017, and SSMS 2017 to see the new operators.

Thanks for reading!

SSMS 2017 Is Now Available For Download

Last Updated May 17, 2017

SQL Server

If you’re into that sort of thing

You can head over here to download it.

I just got it installed, and I’ll be updating this post as I find stuff.

In case you’re curious, this is a whole new install, not just an update. I had to reinstall some stuff to get it to show up in this version, namely Red Gate SQL Prompt.

I’ll be semi-live-blogging and updating this post as I find stuff!

Make like a good DBA and hit F5 repeatedly!

Updates
So uh, you can still set up a Shrink Database task via maintenance plans. Real cool.

Oh, there’s also a bug that Brent found.

It crops up when you try to back up a database with the GUI.

In case you’re wondering, you can still use Autoclose, Autoshrink, and Priority Boost. You can take the pleats out of the khakis…

Alright, so two crashes and some perilously long waits for stuff to load later, I’m gonna call it a day on live blogging stuff.

Time to go hunt down those new adaptive join operators in query plans!

If anyone is missing Utility Explorer in SSMS 2017, there’s a Connect Item over here you can vote on to bring it back.

Thanks for reading!

Hold The FiIter: Startup Expression Predicates

Last Updated April 26, 2017

Development, SQL Server

Long distance information

There was a rather interesting question posted on dba.stackexchange.com recently about CASE expression order of execution with an OR predicate. Mouthful, I know! When I saw it, I got all “I HAVE A REALLY GOOD ANSWER” and started to write demo queries.

Then I wrote this blog post instead. Sorry, j.r. — you see, I hit an actual execution plan bug that did not bode well.

You can’t both be 96% of the plan cost. This is an actual plan.

And hey, what’s with those filters?

I thought those only showed up when we did something weird, like have a predicate on a MAX datatype, or a Window function result? Why are they here?

The answer is, of course, because SQL is lazy.

Let’s take a quick look at the demo query:

SELECT SUM(x)
FROM (
SELECT 
CASE WHEN EXISTS
    (SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND u.Id >= 1 AND u.Id  <= 22656)
   OR EXISTS
    (SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND u.Id >= 22657 AND u.Id <= 60000)
  THEN 1
  ELSE 0 END AS [Pig]
FROM dbo.Users AS u
) x (x)

SELECT SUM(x)

FROM (

SELECT

CASE WHEN EXISTS

(SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND u.Id >= 1 AND u.Id <= 22656)

OR EXISTS

(SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND u.Id >= 22657 AND u.Id <= 60000)

THEN 1

ELSE 0 END AS [Pig]

FROM dbo.Users AS u

) x (x)

Both of the filters have the Startup Expression Predicate property. Here’s the first one.

What the heck does that mean?

Well, when everyone is telling you to read a query plan from right to left (even I do that, don’t be ridiculous), the order things logically occur in is from left to right. In this case, that filter predicate is only asking for rows from the Posts table where rows from the Users table meet the predicate.

If you don’t believe me, maybe you’ll believe STATISTICS IO.

For the first query we ran, the output looks like this, with the Posts table being scanned 23,672 times, resulting in 71,063 logical reads.

Table ‘Posts’. Scan count 23672, logical reads 71063, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Users’. Scan count 7, logical reads 6587, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

If we change our query a little, that changes a lot. Here’s me vs. Jon Skeet, in a battle of the, well, in the battle of I-Lose-Really-Badly. That’s somewhere in France.

SELECT SUM(x)
FROM (
SELECT 
CASE WHEN EXISTS
    (SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId = 22656)
   OR EXISTS
    (SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId = 3187747)
  THEN 1
  ELSE 0 END AS [Pig]
FROM dbo.Users AS u
) x (x)

SELECT SUM(x)

FROM (

SELECT

CASE WHEN EXISTS

(SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId = 22656)

OR EXISTS

(SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId = 3187747)

THEN 1

ELSE 0 END AS [Pig]

FROM dbo.Users AS u

) x (x)

Here’s what STATS IO tells us: we did way less work. Posts was only scanned twice, and we only did 6 logical reads.

Table ‘Posts’. Scan count 2, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Users’. Scan count 7, logical reads 6587, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Return nothing, read nothing

Where this gets super duper extra cool, is when we run queries where no rows match at all. Check out these two suckbags.

SELECT SUM(x)
FROM (
  SELECT 
  CASE WHEN EXISTS
      (SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId >= 2147483647)
     OR EXISTS
      (SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId <= -2147483647)
    THEN 1
    ELSE 0 END AS [Pig]
  FROM dbo.Users AS u
) x (x)


SELECT SUM(x)
FROM (
  SELECT 
  CASE WHEN EXISTS
      (SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId = 2147483647)
     OR EXISTS
      (SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId = -2147483647)
    THEN 1
    ELSE 0 END AS [Pig]
  FROM dbo.Users AS u
) x (x)

SELECT SUM(x)

FROM (

SELECT

CASE WHEN EXISTS

(SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId >= 2147483647)

OR EXISTS

(SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId <= -2147483647)

THEN 1

ELSE 0 END AS [Pig]

FROM dbo.Users AS u

) x (x)

SELECT SUM(x)

FROM (

SELECT

CASE WHEN EXISTS

(SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId = 2147483647)

OR EXISTS

(SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId = -2147483647)

THEN 1

ELSE 0 END AS [Pig]

FROM dbo.Users AS u

) x (x)

For these, STATS IO is the same. The Posts table never got touched, because no predicates passed the filter. No scans, no reads.

Table ‘Posts’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Users’. Scan count 7, logical reads 6587, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Filterless

When you’re working with query plans and see a filter, it could very well be there for a good reason. That good reason being, of course, that SQL is lazy.

Thanks for reading!

Brent says – wow, I’ve never seen that Startup Expression Predicate! I had no idea that was even a thing.

Announcing Erik’s First Video Class: T-SQL Level Up, Level 2

Last Updated April 23, 2017