Blog

The Surprising Behavior of Trailing Spaces

Last Updated April 11, 2017

In every training class we do, I learn stuff from students. Here’s last week’s surprise.

Create a temp table, and put in a value with trailing spaces. Then query for it using the = and <> operators, using different numbers of trailing spaces:

CREATE TABLE #Wines (Winery VARCHAR(50));
INSERT INTO #Wines VALUES ('Cliff Lede        ');
SELECT * FROM #Wines WHERE Winery = 'Cliff Lede';
SELECT * FROM #Wines WHERE Winery <> 'Cliff Lede';
SELECT * FROM #Wines WHERE Winery = 'Cliff Lede        ';
SELECT * FROM #Wines WHERE Winery <> 'Cliff Lede        ';

CREATE TABLE #Wines (Winery VARCHAR(50));

INSERT INTO #Wines VALUES ('Cliff Lede ');

SELECT * FROM #Wines WHERE Winery = 'Cliff Lede';

SELECT * FROM #Wines WHERE Winery <> 'Cliff Lede';

SELECT * FROM #Wines WHERE Winery = 'Cliff Lede ';

SELECT * FROM #Wines WHERE Winery <> 'Cliff Lede ';

SQL Server simply ignores the trailing spaces:

When it comes to trailing spaces, SQL Server is the honey badger

The mind-blowing part to me is the <> operator – that seemed utterly crazy to me.

And if you add another table, and join to it, guess what happens:

CREATE TABLE #PreferredDrinks (Drinker VARCHAR(50), FavoriteWine VARCHAR(50));
INSERT INTO #PreferredDrinks VALUES ('Brent Ozar', 'Cliff Lede');
INSERT INTO #PreferredDrinks VALUES ('Richie Rump', 'Cliff Lede   ');       /* Has 3 spaces */
INSERT INTO #PreferredDrinks VALUES ('Tara Kizer', 'Cliff Lede        ');   /* Has the exact number of spaces */
INSERT INTO #PreferredDrinks VALUES ('Erik Darling', 'Manischewitz');

SELECT *
  FROM #Wines w
  INNER JOIN #PreferredDrinks p ON w.Winery = p.FavoriteWine;

CREATE TABLE #PreferredDrinks (Drinker VARCHAR(50), FavoriteWine VARCHAR(50));

INSERT INTO #PreferredDrinks VALUES ('Brent Ozar', 'Cliff Lede');

INSERT INTO #PreferredDrinks VALUES ('Richie Rump', 'Cliff Lede '); /* Has 3 spaces */

INSERT INTO #PreferredDrinks VALUES ('Tara Kizer', 'Cliff Lede '); /* Has the exact number of spaces */

INSERT INTO #PreferredDrinks VALUES ('Erik Darling', 'Manischewitz');

SELECT *

FROM #Wines w

INNER JOIN #PreferredDrinks p ON w.Winery = p.FavoriteWine;

All of the joins work regardless of the trailing space count:

This behavior is documented in KB 316626. Turns out the K in KB really does stand for knowledge.

Does a TempDB spill mean statistics are out of date?

Last Updated April 11, 2017

Brent Ozar

Execution Plans

When you see a yellow bang in an execution plan, and you hover over it to discover that SQL Server spilled out to TempDB, your first reaction is to wail in pain and rend your clothes.

After that, though, it seems like people always ask the same question:

“Why doesn’t updating stats fix this?”

In the example here, I’m using Cade Roux’s rank-and-percentile query against the Stack Overflow data dump. It’s not that statistics are out of date – look at the estimated number of rows versus the actual number of rows.

They’re identical. The stats are fine.

And no matter how many times I update statistics, I’ll still get a ~400MB spill to disk.

I’m not using a small server, either: my virtual machine has 32GB RAM, and I’ve allocated 28GB of that to SQL Server. There are no other queries running at the same time – it’s just one lonely query, spilling to disk for the lack of 400MB RAM.

When I right-click on the SELECT operator and click Properties, I get the fancy properties window in SSMS, which shows me more details about the query’s memory grants:

Note the rigorous devotion to alphabetical order in the properties window. It’s up to you, dear reader, to figure out what you’re supposed to read, in which order. Here, I’ll give you a peek at my decoder ring:

DesiredMemory – look here first. It’s how much memory SQL Server wanted.
RequestedMemory – sometimes, this is less than desired. (I’d really like a Porsche 911R, but I know better than to ask my wife for the budget to buy one.)
GrantedMemory – how much I actually got.

SQL Server wasn’t off by too much: this operator only needed 7,643 more 8KB pages, around 60MB. Close – but no cigar.

Fixing it with the claw hammer: MIN_GRANT_PERCENT

Starting with SQL Server 2012 SP3, 2014 SP2, and 2016, I have a nifty new query option:

OPTION (min_grant_percent = 50)

1	OPTION (min_grant_percent = 50)

That brains SQL Server into giving me all the memory I need, and then some. The spill is gone from the execution plan, and my memory grants are way higher:

But this has a terrible drawback: I’m claiming memory that no one else can use while my query runs. And how do I know if I even got that grant percent right, anyway? It’ll change over time as my data changes.

Fixing it with the scalpel: CREATE INDEX

Whenever you’re tuning queries, look at the big picture: if the data was already sorted in the order you needed, you wouldn’t be having this problem with the sort operator. And SQL Server’s trying to give you a hint in the form of a missing index recommendation:

That impact number might seem low at just 14, but remember – that’s a suggestion SQL Server came up with before it even executed the query. Before the query ran, it didn’t even know about the TempDB spill, so the impact’s going to be even larger.

Can you fix every spill? Of course not – sometimes, you’re dealing with data that’s larger than the physical box’s memory. But even when you can fix it, just make sure you’re fixing it with the right tool.

Free Webcast Next Tuesday: How to Triage SQL Server Emergencies

Last Updated April 11, 2017

Brent Ozar

Processes and Practices

I’m speaking at the PASS DBA Fundamentals Virtual Chapter next Tuesday! Here’s the topic:

When the phone rings, you need a repeatable process to discern the symptoms, root cause, what will happen if you don’t take action, and based on that, what actions you should take. You’ll see Brent step through several emergency scenarios live, learn what tools he uses, and how he uses them. Brent Ozar loves to make SQL Server faster and more reliable. He created sp_Blitz® and sp_BlitzFirst®, and he loves sharing knowledge at BrentOzar.com.

Fun With Logging Dynamic SQL

Last Updated April 7, 2017

Erik Darling

T-SQL

While working on a demo

I thought this might make a fun aside to share on the blog, because working with dynamic SQL can be challenging. Especially if you don’t have a monitoring tool or application profiling to figure out who ran what and when, most of the time you’ll have no idea what went wrong.

For example, the dynamic SQL that my query produces (sometimes) looks like this:

SELECT 		
  u.DisplayName, 
  SUM(CONVERT(BIGINT, u.UpVotes)) AS [UpVotes], 
  SUM(CONVERT(BIGINT, u.DownVotes)) AS [DownVotes], 
  SUM(CONVERT(BIGINT, u.Reputation)) AS [Reputation] ,				
  SUM(CONVERT(BIGINT, p.CommentCount)) AS [TotalComments], 
  SUM(CONVERT(BIGINT, p.Score)) AS TotalScore ,		
  SUM(CONVERT(BIGINT, c.Score)) AS [CommentScore] 
FROM Users AS u  
JOIN Posts AS p ON p.OwnerUserId = u.Id 
JOIN Comments AS c ON c.UserId = u.Id  
                   AND c.PostId = p.Id 
WHERE 1 = 1 
AND u.Id = @i_UserId 
AND p.PostTypeId = @i_PostTypeId 
GROUP BY u.DisplayName  
ORDER BY SUM(CONVERT(BIGINT, u.Reputation)) DESC

SELECT

u.DisplayName,

SUM(CONVERT(BIGINT, u.UpVotes)) AS [UpVotes],

SUM(CONVERT(BIGINT, u.DownVotes)) AS [DownVotes],

SUM(CONVERT(BIGINT, u.Reputation)) AS [Reputation] ,

SUM(CONVERT(BIGINT, p.CommentCount)) AS [TotalComments],

SUM(CONVERT(BIGINT, p.Score)) AS TotalScore ,

SUM(CONVERT(BIGINT, c.Score)) AS [CommentScore]

FROM Users AS u

JOIN Posts AS p ON p.OwnerUserId = u.Id

JOIN Comments AS c ON c.UserId = u.Id

AND c.PostId = p.Id

WHERE 1 = 1

AND u.Id = @i_UserId

AND p.PostTypeId = @i_PostTypeId

GROUP BY u.DisplayName

ORDER BY SUM(CONVERT(BIGINT, u.Reputation)) DESC

Notice those two variables in the WHERE clause? Part of how using parameterized dynamic SQL protects you is by not just sticking whatever stinky nonsense users submit directly in to your dynamic SQL block. That’s also what makes it challenging to troubleshoot. Sure, you can head to the plan cache, but you may not find anything there.

Logging Tables

If your company is too cheap to get you monitoring tools, well, that’s another discussion. If you want to figure out why dynamic queries are slow anyway, here’s one option. Without showing you all of what my dynamic SQL does, I’ll show you how the logging works. It starts with a table!

CREATE TABLE dbo.Logger
(
  Id BIGINT IDENTITY(1, 1),
  RunHash UNIQUEIDENTIFIER,
  UserName NVARCHAR(128),
  StartTime DATETIME2,
  EndTime DATETIME2,
  RunTimeSeconds AS DATEDIFF(SECOND, StartTime, EndTime),
  DynamicSQL NVARCHAR(MAX)
);

CREATE TABLE dbo.Logger

(

Id BIGINT IDENTITY(1, 1),

RunHash UNIQUEIDENTIFIER,

UserName NVARCHAR(128),

StartTime DATETIME2,

EndTime DATETIME2,

RunTimeSeconds AS DATEDIFF(SECOND, StartTime, EndTime),

DynamicSQL NVARCHAR(MAX)

);

You can add in more stuff if you want, this is just the bare minimum I’d collect.

Now, in the stored procedure, you can do something like this.

DECLARE @RunHash UNIQUEIDENTIFIER = NEWID()

...Whole Mess O' Stuff...

INSERT dbo.Logger ( RunHash, UserName, StartTime, EndTime, DynamicSQL )
        VALUES(@RunHash, SUSER_SNAME(), SYSDATETIME(), NULL, @Command)

EXEC sys.sp_executesql 'Your lousy query'

UPDATE dbo.Logger 
SET EndTime = SYSDATETIME()
WHERE RunHash = @RunHash

DECLARE @RunHash UNIQUEIDENTIFIER = NEWID()

...Whole Mess O' Stuff...

INSERT dbo.Logger ( RunHash, UserName, StartTime, EndTime, DynamicSQL )

VALUES(@RunHash, SUSER_SNAME(), SYSDATETIME(), NULL, @Command)

EXEC sys.sp_executesql 'Your lousy query'

UPDATE dbo.Logger

SET EndTime = SYSDATETIME()

WHERE RunHash = @RunHash

But all that will log is your dynamic SQL with the variables. What I prefer to do is spin up a placeholder that I can replace values in without messing with the dynamic SQL to be executed. Since this command will never be executed, I can use REPLACE to figure out what my values are.

DECLARE @PrettyPrint NVARCHAR(MAX) = N''
DECLARE @RunHash UNIQUEIDENTIFIER = NEWID()

SELECT @PrettyPrint = REPLACE(
            REPLACE(
              REPLACE(@Command, 
                '@i_PostTypeId', ISNULL(@PostTypeId, N'')
                  ), '@i_UserId', ISNULL(@UserId, N'')
                    ), '@i_Top', ISNULL(@Top, N'')
              )

INSERT dbo.Logger ( RunHash, UserName, StartTime, EndTime, DynamicSQL )
        VALUES(@RunHash, SUSER_SNAME(), SYSDATETIME(), NULL, @PrettyPrint)

UPDATE dbo.Logger 
SET EndTime = SYSDATETIME()
WHERE RunHash = @RunHash

DECLARE @PrettyPrint NVARCHAR(MAX) = N''

DECLARE @RunHash UNIQUEIDENTIFIER = NEWID()

SELECT @PrettyPrint = REPLACE(

REPLACE(

REPLACE(@Command,

'@i_PostTypeId', ISNULL(@PostTypeId, N'')

), '@i_UserId', ISNULL(@UserId, N'')

), '@i_Top', ISNULL(@Top, N'')

)

INSERT dbo.Logger ( RunHash, UserName, StartTime, EndTime, DynamicSQL )

VALUES(@RunHash, SUSER_SNAME(), SYSDATETIME(), NULL, @PrettyPrint)

UPDATE dbo.Logger

SET EndTime = SYSDATETIME()

WHERE RunHash = @RunHash

Logjammin’

I double logged my stored proc calls so you can get a better idea of what happens. Here’s what my logging table looks like:

You can already see one spot where an optional @Top was substituted with the actual number “the user” asked for. Don’t give me any application security guff here, I log in as sa because I am the app and the app is me.

Zooming in on where the other variables live in the WHERE clause, now we can see which ones were called.

This can pose some security risk. If some malicious person were to realize that you’re logging these calls, they could change your code to execute the printable version instead. If security is a concern, you may want to offload the logging code to other stored procedures.

If you want to really ball out on this one, you could even frame the replaced values in special delimiters like ‘||’ and then parse those out as well. But I’ll leave that as an exercise for the reader.

That’s you.

Thanks for reading!

SQL Server 2017 Release Date: May 31?

Last Updated May 12, 2017

Brent Ozar

SQL Server 2017

At the SQLbits conference in the UK, Victoria Holt sat in on a session about the State of the SQL Nation. She writes:

“SQL Server vNext will be launched this year with the UK launch 31 May 2017.”

Keep in mind that “launch” is a marketing term, very different from “release.” (Even “release” is tricky because the code can be released to manufacturing without being available to download publicly.)

Victoria’s post also lays out what’s new in vNext. I don’t think it’s a list that’ll get most conventional DBAs excited, but Linux & Docker support had to have been challenging enough for Microsoft already. (Example: last month’s CTP 1.4 release fixed GETDATE() and introduced VDI support for backup software.)

A May marketing launch would tie in nicely with PASS’s delayed call for Summit speakers, too. It gives the community the best chance to build sessions that will be highly relevant when October rolls around. (Granted, this also means speakers will be building session content for things they haven’t actually done in real-life customer implementations, but that’s how it always is with brand new technology – folks have to kick it around in the lab and then guess how it’ll work in real life. It is what it is.)

Update April 19 – Microsoft’s marketing material is now officially referring to it as SQL Server 2017:

[Video] Office Hours 2017/4/5 (With Transcriptions)

Last Updated April 10, 2017

Brent Ozar Unlimited Team

SQL Server, Videos

This week, Erik, Tara, and Richie discuss altered tables and update statements, parameter sniffing issues, their issues with Azure, a resource for column store indexes, AlwaysON availability groups, killing sleeping connections, and fill factor value for indexes.

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017-04-05

What are the pros and cons of Platform-as-a-Service?

Erik Darling: Richie, I think this is one that you’ll like. Platform as a service, pros and cons, because you like Cloud stuff more than I do. You like things that happen.

Richie Rump: I do?

Erik Darling: In the cloud, a lot more than I do. Like, things that happen in the Cloud. I’m annoyed with the Cloud constantly.

Richie Rump: I do, and if you’re coming from on-prem to Cloud, you probably will be annoyed as well. There’s a lot of things – I haven’t really played a ton with Azure SQL but you know, just a little bit that I have played with, there’s a lot of stuff like why isn’t that there and why isn’t that there and why isn’t that there.

Erik Darling: There’s a lot of our stored procs that just won’t work up there because there are you know, features in certain systems – we can’t use sp_BlitzCache up there because you can’t use a global temp table. I don’t know what you expect me to tell you but…

Richie Rump: But the same thing is true when you code for the Cloud as well. The way that you would program is significantly different if I were to just write a typical server application, so there’s plus and minuses for both sides. I don’t have everything the top of my head other than you’re going to need to try it for yourself. What is going – I thought it was actually Brent Ozar coming on. This is better. But this is better. I was getting scared there for a second. But PaaS – it is – you’re going to have to try it out along with your requirements and see if it’ll work for your environment and your application that you’re building.

Erik Darling: PaaS changes a lot. Like, Microsoft just pulls things in and takes things out. CLR got yanked out very unceremoniously recently because…

Richie Rump: And we’re going to put it back, and nothing.

Erik Darling: Yes, they discovered that there were security issues with letting people use CLR up in Azure then they were just like, oh wait, you can’t do that, all this stuff you want is gone. All that CLR just got pulled out and there wasn’t any like, case by case, like oh you can use it in … you’re special, like it was just like no CLR, figure it out. So pros and cons – con is you control nothing. You just have to…

Richie Rump: Yes, I could definitely see them say, hey you know what, the benefits for this feature really doesn’t outweigh the security concerns that we have and really, who’s complaining about I don’t have CLR in SQL Azure, so we’re probably just not going to put any resources on fixing this thing. It’s just – we’ll just rip it and we’ll just tell people it’ll be back at some point and see who complains the loudest and frankly I don’t see hardly anybody really complaining that I don’t have the .NET CLR inside of SQL Azure so now that we have something like Lambda functions, I can you know – once they start getting better integration with Azure SQL and Lambda functions, now I could write Python or I could write f# or I could write c# or JavaScript or whatever and have it interact with my SQL. So hey, functional languages man, it’s…

Erik Darling: Richie said f#. I want to hit the bell on that every time Richie says f#.

Richie Rump: Hey, functional languages. It’s all the new hotness man, all the cool kids are doing all the functional languages

Erik Darling: Never heard you say f# before like, without it being derogatory.

Richie Rump: I’m not going to talk bad about f#. The folks who know how to wield that broad sword are amazing. You can do a lot of cool stuff there.

Erik Darling: Interesting.

Richie Rump: Yes.

What resources get used when you ALTER TABLE?

Erik Darling: Alright, let’s see here. Ron, I’m sorry, I don’t get that question. Tara, do you want to – there’s a question there from a fellow named Ron. Make more sense to you, it didn’t make sense – thing, kind of went by me. J. H. does ask though, “What resources get used mostly when issuing ALTER TABLE and update statements, CPU desk memory?” Jeez.

What resources get used most? Well, certain ALTER TABLE things don’t use all that much resource. Like, if you alter a table and you add a column that’s nullable and you don’t have a default in there, it’s pretty quick. It doesn’t really use much resource. For updates, depends on the query. If you have a totally unindexed query and you let that run, it’s going to use a whole lot of everything. So then it has to, you know, first run that query, read a bunch of pages, figure out a bunch of stuff out, then write all that stuff and then you know, internally SQL figures out if it can just do an in-place update or if it has to do an insert and delete, and all this other stuff. So I mean, you’re looking at a bunch of resources depending on how well optimized the query and indexes are to begin with. If you’re updating a column that is in 50 indexes, you’re going to do a lot more work for like, you know, writing and what not than if you update a column that’s in five indexes. So that question’s a little weird, but alright. I see why you didn’t want to put that on Stack Exchange, you would have gotten down-voted into oblivion.

Richie Rump: Oh no, but why would anybody do that? An honest question on Stack Exchange?

Erik Darling: Well, honest is different from good. Honest and good are two totally different things.

Richie Rump: I see how you backed away very quickly from the In-Memory OLTP question. That was pretty amazing there.

Erik Darling: Well, because the person left. I’m not going to ask the question If someone left.

Richie Rump: They leave but they ask a question saying, hey you know, I’m going to actually watch this later, maybe.

Erik Darling: Maybe, but they didn’t say, hey I got to go, sorry. I just got a someone’s left.

Richie Rump: And then again, In-Memory OLTP, just – it makes them freak out, which is pretty amazing.

Erik Darling: Yes, so like you know, I was honest with my wife about the fact that I spent $400 on a bottle of Yamazaki 18, that was not good. So there’s like honest and good. Two different things. I didn’t mean to insult your question, I just wanted to draw like my boundaries between honest and good.

Richie Rump: I’m so sorry, that you spent that much money on the bottle. I’m sorry for that.

Erik Darling: Oh no, it’s one of those, I put it in my little bar cart and I just say that I have it. I don’t let anyone touch it. My precious, my precious – just one of those little White whale holy grail things that you get. Of course I say that about a lot of different bottles of scotch and they end up gone pretty quick. What can you do?

Why is parameter sniffing an issue in only some of my procs?

Erik Darling: Let’s see, we have a parameter sniffing issue in some stored procedures but not an issue in other stored procedures. Any ideas? Write the stored procedures where you have the parameter sniffing problems, just like the ones where you don’t have them and then they’ll go away. That sound about right, Tara?

Tara Kizer: No.

Erik Darling: No? No, you shouldn’t? What would you do if some of your stored procedures had parameter sniffing?

Tara Kizer: So, I would see what the problem is here, compare the execution plans is it an optimization and is it optimizing for a value that isn’t representative of the other values. I just worked out a client this week that is having sort of a parameter sniffing issue between two servers and it just – one server’s locked in a clustered index scan, the other one is locked in a non-clustered index seek and you know, using index hints has worked around the problem but doesn’t necessarily solve – it doesn’t tell us what the root cause is, so anyway, to answer the guy’s question, I’d be comparing execution plans and seeing what I could do to get it to use the better execution plan for the rest of the grammar values.

Erik Darling: Sounds like a good idea. At what point would you resort to something crazy like plan guides?

Tara Kizer: When a Microsoft engineer’s on sight and looking at the system request. That’s when I started using plan guides. It really was onsite for an update for ESCO, because it was such a major upgrade we decided to pay someone to watch me. Plan guide or index hints or any of those, option for – it doesn’t matter what you do. Plan guide is just an easy – I shouldn’t say it’s easy. It’s a hard opter to put in place because it has so many requirements that doing the optimize four trick or index hint trick in the actual query can be easier. Plan guide is just another object that has those options in it.

Erik Darling: Sounds good.

Richie Rump: Yes, sometimes I just, you know, depending on the query, just have a recompile every time and just let it do it and automatically it gets better, and if I need to put more effort into that and if that’s not the solution then so be it, but I think sometimes especially when I’m doing the time performance tuning, we need to have that exit goal – okay, what point is good enough, because I could spend another week on this problem, but sometimes the fastest, easiest way is maybe not the best way, but it’s the way that meets the requirement to get out of the performance tuning.

Erik Darling: Yes, you know what, big thing for me with the parameter sniffing is always making sure that my indexes are lined up so that SQL doesn’t have an option between too many different plans. Tara ran into a weird one where it was the same everything and the query on one server chose a totally different index and everything from a query on another server. But for me, if it’s just happening on one server then I’m most concerned that like you know, it’s like choosing a weird key lookup plan when it should be choosing like a different plan. You got to get rid of that key lookup to get the same shape plan in place.

What’s your experience with Azure?

Erik Darling: There’s a long question there from Ian. I think that’s a Tara question, so if you want to read that and answer that in the background, we’ll get to something else. J has a follow up question about Azure. “Do you have similar experience of the features that…” – no, I don’t touch Azure. I don’t want to touch Azure. I don’t want to be around Azure. You know, I’ll take Amazon RDS and have like a complete list of stuff that doesn’t work and isn’t compatible that’s pretty stable over time.

Richie Rump: Yes, I think that’s what gets people especially in the Microsoft community kind of confused about us, is that you know, we’re pretty agnostic about the Cloud. I do a lot of work in AWS, we obviously wrote the white papers for Google Cloud, SQL server in there, but Azure – I dabble in Azure but by far, I use AWS more than the other two. So we could give you a big general idea but when we get to some of the specifics, we haven’t really dived into some of those, probably outside of Google Cloud and AWS, which I think surprises some of our more Microsoft-y type followers.

Erik Darling: Yes, Azure has kind of been weird from the get go and we try to get our labs set up in Azure the way we have it set up in AWS a long time back and after three days – and I think Jeremiah, who’s smart doing stuff in Azure, he said I don’t know, no, I throw my hands up. So there’s that.

Richie Rump: It feels like Azure has this kind of enterprise-y security layer that they’ve kind of put in their Cloud where AWS doesn’t really have that.

Erik Darling: There’s a portal and a beta portal and a preview portal and then PowerShell. It’s like, I don’t know what to do.

Richie Rump: Yes, when I’m messing with Azure, I’m always you know, messaging our Microsoft community guys like how do I do this, how do I do that? With AWS, you do this, but how do I do it over there and typically, AWS is a little simpler to get into and out of as opposed to Azure, there’s layers upon layers that you kind of have to go through. Yes, it’s peculiar. I would definitely say it’s definitely peculiar, but not to say that it’s bad. It’s just additional steps to kind of have to go through in general, to get stuff done.

Erik Darling: I would agree.

Any good references on columnstore indexes?

Erik Darling: Gregory asks if we have any good references or thoughts on column store indexes. I sent a link out into chat through a website from a fellow name Niko Neugebauer, I believe that’s how you pronounce it, I couldn’t be a 100% on that, but it’s Nikoport.com and he’s been writing for years and years about column store and he has more good thoughts on it than I do. As far as order of columns, the number of columns, it doesn’t matter in column store, the way it does with row store indexes. For non-clustered you know, there are obviously some limits on column length at this point. I’m not sure if you can have max column types in there just yet. I think I read something – no, that was hekaton. Yes, because when I go to column store to find my stack overflow database, I always have to ditch the body column from the post table, so it does have some limitations but as far as the number of and the order of – and you can’t column store index a computed column either; another funny thing. So there are limitations, I would let the limitations guide me more than some arbitrary number though.

How should I train someone on Always On Availability Groups?

Erik Darling: Tara did you want to answer that long question about the always on stuff?

Tara Kizer: From Ian?

Erik Darling: Yes.

Tara Kizer: Sure. So Ian is saying he needs a – only DBA – the company is large enough to need another DBA and ASCII training, an emergency backup for our main instance, they’ve got AlwaysOn, Analysis Services Integration Services, Reporting Services – they’ve got everything apparently. Trainee is knowledgeable on SQL server but it’s another office, it’ll be remote and it’s a second language, so he’s asking how to train this person. Ian is asking the question you know, that he has been doing it for eight years and hasn’t documented it. It sounds exactly like what I’ve had to do in the past. You know, I’m not a documenter, that’s for sure, everything I store is in my head and people need to learn that they’re going to have to ask me because I don’t write stuff down. So I’ve had to train people in the past. One time I was going on maternity leave for three months and we hired a consultant to replace me while I was gone so I had to get that person up to speed on – it was like five years worth of knowledge of troubleshooting performance issues in production and that’s hard to do because it’s so specific to applications at the time. So anyways, what worked well for us is shadowing. I realize that this other person is remote but you can do a WebEx session or go to webinar, Skype, whatever you’re – Google hangouts or whatever and share your screen, talk through everything that you’re doing for one, two hours per day. Don’t do a whole day because that’s just going to burn everybody out, but you know, share your screen and talk with the person. You know, what are you doing day to day, because that’s really what they need to know is day to day, and also talk about the things that are – come up as emergencies. What are the types of things and have that person be responsible for documenting going forward and that’ll be part of their training, you know, writing it down.

Erik Darling: Sounds good to me.

Can I add an AG listener online?

Erik Darling: Suzie asks if she – well, I’m going to assume Suzie is a she. Suzie is quite a gal’s name. “Can I create a listener for an existing AlwaysOn availability group on the day or do I need to wait for after hours? Can it cause any issues?”

Tara Kizer: I mean, you certainly can. Will it cause any issues, probably this one time it will cause an issue, so you know, I would run the command later in the day. But you should be able to add a listener without an issue, they’re just – you know, weird things happen when you’re working in production, you know, you’re rolling the dice.

Richie Rump: Yes, I mean, I would always opt to do stuff like that, you know, later or during a maintenance window but creating a listener just seems like one of those sort of like nebulous things where it’s like, okay, I’ve got to create this and then nothing’s actually going to use it until I tell it to…

Tara Kizer: In theory that’s how things should work but I don’t know about everyone else, but I have bad luck in production.

Erik Darling: So Suzie, the verdict is it’ll probably be okay but you probably also want to wait.

Do you have a server health checklist?

Erik Darling: James asks, “Do you have a health server checklist template?” Now James, you’ve been in Office Hours a lot. You’ve been to our training, you’ve been to our online training. You should have our first responder kit with the set up checklist in it, James. I know you have the first responder kit. I would definitely want to download, if you don’t have it handy, download that and look for the set up checklist. It’s a pdf, it’s not really a template, but you can go through and, you know, check things off as you go, make sure that it’s all done. Other than that you know, our healthy server template is running sp_Blitz. Making sure that nothing crazy comes back in there, so those are our versions of healthy server checklist templates.

Do sleeping connections cause negative impacts?

Erik Darling: J. H. asks, well, says, “I have a bunch of sleeping connections not doing anything for days. Are these connections taking up resources or are there any negative impacts? Is it good to kill these connections on a daily basis? What reasons to kill them or not to kill them?” Well, off the top of my head I could give you a good reason not to. Now, I’m going to pop a long command in there for sp_WhoIsActive that’s going to give you a bunch of information about what a query has been up to. If you run that, and you see that one of those sleeping transactions has a bunch of tran’ log writes, then it’s also blocking – you might also see a blocking – if you see it has a bunch of transaction log writes, just know that if you kill it, it may have to roll back whatever transaction log stuff it wrote out. So a good reason not to just kill them is you may have an unforeseen rollback. A good reason to kill them, they might be blocking stuff.

Tara Kizer: If these sleeping connections don’t have an open transaction and they aren’t blocking anything then I don’t bother killing them. They aren’t using many resources, or using a teeny, teeny, teeny bit of resources. It’s like 8K of memory or something like that, you likely have sleeping connections due to connection pulling so applications will go out and connect to a server and say the app admin has configured it to go ahead and do 200 connections, but maybe you have only 50 connections are needed. Those 200 connections are still going to be there and if you go ahead and kill them, the app’s just going to recreate those 150 connections again. So take a look at your system. What are those sleeping connections doing and do you have connection pooling in place? Talk to your web ops people and see what number they’ve configured for the pool.

Richie Rump: There’s a task I need to set up. Pooling was on my list, I completely forgot about it.

Should I use fill factor 100% or something else?

Erik Darling: Let’s see here. Oh, there was a fun question. Where did that question go? There was a fun question that I wanted to ask.

Tara Kizer: I’ll let you find it. How about Renee’s question that just came in? “Would you recommend to use a fill factor of 100% for indexes or would you go with a different value?” I like that question.

Richie Rump: I would absolutely go with zero. I think zero is the way you need to go.

Tara Kizer: Zero and 100 are equivalent when we’re talking about fill factor, so he is still right. Default to 100%. Don’t change it at the instance level. You need to take it by case by case basis. If you’re talking about GUIDs and you’re trying to avoid page splits then maybe reduce your fill factor. The recommendation is to keep your fill factor 100% except when page splits are a problem, like is the case with GUIDs. But I had a client a few weeks ago that said that they had lowered the fill factor from 100% down to 75, 80. There was something like that said 78, or some weird number and it was because what they had encountered page splits years ago, and that was causing problems and I was like, is it causing problems now because your reads probably are slower with these lower fill factors. It’s just not something they realized, that read performance goes down when you lower your fill factor down. Yes, it helps out with your inserts to avoid page splits but usually, select performance is way more critical than the insert updates and deletes because the selects are what users are using generally.

Richie Rump: I think that’s kind of a tough thing in general to try and figure out is at what point do I stop optimizing prematurely, right? What is going to be a good median to say hey, this is a good performance tuning whatever I do to start off with versus this is a problem now and I need to go and fix it.

Erik Darling: The stupid thing about fill factor for me anyway, is – I mean, it’s like – one of the first things is like introducing like enforced fragmentation into your indexes, you’re leaving a certain amount of empty space in there that SQL is going to have to read and store on disk and in memory. The second thing about fill factor for me is that it’s not maintained on insert update and delete. It’s only enforced when you rebuild your index. So like, in between rebuilds, your index will probably do better on page splits but unless you rebuild your index fairly frequently, all those new pages, all those split pages, they’re just going to fill up to 100% and split again, so it’s like one of those like well, I chose a 70% fill factor for my GUIDs, well I still have lots of page splits because all these new pages keep getting split. So like, you would have to be out of your mind to constantly rebuild an index to try and put that back in place. So I usually just leave fill factor somewhere between 90 and 100 and just never bother with it again. I prefer 100 because then I don’t have to change anything.

Richie Rump: Yes, five is usually a bad idea.

Erik Darling: Five is terrible.

Richie Rump: So if you’re considering that, don’t.

Erik Darling: Don’t use a fill factor of five…

Unless you’re doing a fragmentation demo, then it might be okay.

Richie Rump: Unless you want to call us later and then we’ll explain to you why that’s a bad idea.

Erik Darling: Yes. I can do that, in like two words.

How do I talk people out of including every field in an index?

Erik Darling: Alright, I found a question that I thought was interesting. It was from Ben. “What’s the best way to convince someone that a new index with every field as an include is not necessarily a good thing?”

Tara Kizer: I know that you and Brent will say that when you have it – when we see a client that has a – maybe the key is fine but then they have like, 48 includes, to go ahead and chop off the includes and check out the performance of the query, let it do a key lookup to the clustered index or also test it with the covering index with all those includes. Now, of course the key lookup version is going to be slower but it is the perfect difference between the two…

Erik Darling: Well, it comes down to the query. If it’s like on a single primary lookup or something, then I don’t care if it does one key lookup for 40 columns, go ahead. It’s when there’s like thousands – hundreds of thousands of them, I’m like, no, not the key lookups.

Tara Kizer: But when you’re looking at the missing indexes and there’s a missing index with say, 48 includes, do you go ahead and recommend that – when it has a really high benefit per day, do you go ahead and recommend that index or do you chop off the includes for them?

Erik Darling: I vary a little bit on that. If I can track down the query that’s asking for it, and I look at this query and I say do you need this query as it lives and breathes and they say yes, we cannot change this query, we need this query as it lives and breathes then I say okay, then you need this index to make this query better. You can try it with just a key column first and see how that does or you can go for the whole kit and caboodle and see – you know, take the hit on having 48 includes in there. But if I can’t, then – if I can’t find that query and ask them those questions, then I’ll just give them the choice of you know, key columns plus includes and then just monitor how it gets used in BlitzIndex.

Richie Rump: When I start off with an index, it’s usually the key columns that I need and then the actual clustered key inside the includes and then that’s it. I put the clustering key in there just in case someone changes that later. It’s not going to do anything, right? And then as we start going on and hey, there’s a performance problem here or there, that’s when I’ll start throwing includes on top of that and not just let’s throw everything in there so I’ve gotten this really big index now for no reason. I’ve got no queries running up against it. It’s just taking up space, it’s on disk, it’s just taking up space in memory. Why is all this stuff there? So I don’t put included in there until I need it. That’s it.

Erik Darling: Alright, well we are – we have hit the 12:46 mark. I don’t see any other questions pouring in that I can answer so I think we should call it on Office Hours. What do you guys say?

Richie Rump: Yes, I’m good.

Erik Darling: Alright, cool. Take care everyone, thanks for showing up. We will see you next week. Goodbye.

First Responder Kit Release: Now With 100% More Working Code Than The Irish Space Program

Last Updated April 7, 2017

Erik Darling

First Responder Kit, SQL Server

Every April I think about this girl I grew up with named April who smelled like Cheez-Wiz.

She died tragically in a home for the blind on Taco Tuesday.

Special thanks to @jeffrosenberg , @mrthomsmith , @rwhoward for contributing this month. Your complimentary oxygen is now available. The rest of you can go download the First Responder Kit.

sp_Blitz Improvements

#762 This big dummy would crash on certain versions of 2008R2 when checking for memory dumps, ironically causing a memory dump. So hey, there’s that. Now we don’t do that for those versions. You’re welcome.
#824 Some files are bigger than others, and some sizes do matter. For instance, we only care if your tempdb files are more than 1 GB apart in size. Now you know, and knowing is half the burden,

sp_BlitzCache Improvements

#770 We now warn you if you use a Spatial index, since that probably means you accidentally created some Spatial data in SQL Server. I mean, who does that?
#796 sp_BlitzCache will now warn you if your code creates or drops tables and indexes while running. Yeah, who knew?
#801 We do our best to bubble up information from statements in stored procedures to their calling code. Because that’s the right thing to do. Why should those lazy statements get all the attention? They’re not special. Now we do that with memory grants, where available.
#808 Code is weird. Code that calls other code is weirder. Code that gets blocked is double weird, like a trailer full of twin sized mattresses. We now warn you if low cost queries have high CPU, or if long running queries have low CPU. Surely those are signs of trouble.
#816 Ah, zero. Is there anything you can’t do? Oh, be divided by. Yeah. Let’s not do that.

sp_BlitzFirst Improvements

#819 Biggie Smalls for mayor, the rap slayer, Hekaton say your prayers. Also get notified if any activity is happening by BlitzFirst.
#823 Funny story: no one ever found this error, because no one ever ran sp_BlitzFirst when they had a query being blocked for more than 30 seconds, which are EXACTLY THE KIND OF PEOPLE WHO SHOULD BE RUNNING sp_BlitzFirst, especially since this bug is fixed.

sp_BlitzIndex Improvements

#772 Mode 2 is now capable of being output to a remote server. Not available for other modes, or Depeche Mode. Thanks, @rwhoward!
#777 Cleanliness is next to Godliness, so I cleaned up this code to find your blasphemous computed columns based on scalar functions so you may be doomed for all eternity.
#778 Unused indexes are, like, a bummer, man. But they’re a bigger bummer when they have a lot of writes. Thanks to @amtwo for the totally badical suggestion. Air guitar and whatnot.
#779 I keep hearing about this SQL Server 2016 thing. I hope someday someone uses it, so they can use temporal tables, and marvel at the information that so_BlitzIndex provides them.
#711 If you had two copies of a database (like if you backed it up, then restored it under a different name), you might have been seeing double. Now, it’s not sp_BlitzIndex’s fault – it’s just due to your trifocals.
#827 Brent did a big round of performance tuning, and it’s now 3x-20x faster. That probably means he introduced a ton of bugs. Hey, It Just Runs Faster!®

sp_BlitzWho Improvements

#822 Sleeping transactions can violate the NAP by blocking other queries with their snoring so we can call our gold level sp_BlitzWho to arrest them. or just find them. Why weren’t we going this before? That was silly.

sp_DatabaseRestore Improvements

#765 We have the most elegant, luxurious storage for our LSNs. Really. Ask anyone. They’re NUMERIC(25,0).
#766 Similarly, we have the most elegant and luxurious storage for our file paths: NVARCHAR(MAX).
#775 Being compatible is important. Being backwards compatible means you have more bugs to fix. But hey, now you can find bugs if you’re on 2008 or later.
These were all done by @jeffrosenberg! A threefer!
#787 It’s not a stored procedure unless it starts with sp_ — thanks @mrthomsmith !

You can download the updated FirstResponderKit.zip here.

Announcing Our Summer/Fall Training Schedule and Instant Video Access

Last Updated April 5, 2017

Brent Ozar

Company News

I had a great time in San Diego over the last couple of weeks teaching classes. I’d love to be able to have every training class where we can do breakfast & lunch outdoors!

Performance Tuning Class, San Diego, March 2017

Buuut, we gotta move classes around, so here’s our upcoming lineup:

Performance Tuning (4 days, $3,995) – You’re a developer or DBA who needs to speed up a database server that you don’t fully understand – but that’s about to change in four days of learning and fun with Brent.

April 17-20 online (still 2 seats left!)
August 22-25 online
October 24-27 in Chicago
Learn more and register now

The Senior DBA Class (4 days, $3,995) – You’re a SQL Server DBA who is ready to advance to the next level in your career, but you aren’t sure how to fully master your environment and drive the right architectural changes.

April 11-14 online (still 3 seats left!)
August 28-31 online
October 16-19 in Chicago
Learn more and register now

And a new perk: instant video access. As soon as you enroll, you can start watching a recording of your class. Our Instant Replay access has been a huge hit with students – they’ve really loved being able to catch up with things they missed in class. I figured, “Why not let you prepare ahead of time, too?” So now, when you register, you can start streaming the videos right away. No need to wait for class for the learning to start, and you get access to the videos for 1 year (so you can prep before class, and revisit your learning after class.) There’s nothing else like it in the industry.

You can also catch me at upcoming conferences including SQLDay 2017 in Wroclaw, Poland and SQL Intersection in Orlando, Florida. See you around!

Announcing Our New DBA Job Interview Q&A Course.

Last Updated April 5, 2017

Brent Ozar

Interviewing

Whether you’re applying for a DBA job or getting ready to hire a DBA, you probably don’t know what questions to ask to tell if somebody really knows what they’re doing or not – or what answers to look for.

There’s the classic tired question, “Tell me the difference between a clustered and nonclustered index,” but after that, what?

In our new DBA Job Interview Q&A course, you’ll hear:

Heck – let’s just go ahead and get started with the Development DBA Questions module right now for free. (If you can’t see the video below, you’re reading this on an RSS reader or off-site – head over to the blog post.)

Wanna see the rest? It’s part of our DBA Subscription – normally $39/month, but use coupon code Relaunch2017 and get off at $19.50/month.

Announcing Monthly Training Subscriptions and Some Cool Perks.

Last Updated April 11, 2017

Brent Ozar

Company News

We’ve been selling training videos for a few years now, and we’ve figured out that we have two kinds of buyers:

Price-sensitive short-term customers – who are often paying out of their own pocket because their company won’t pay, and they want to learn as much as they can in a short period of time
Long-term enterprise-y customers – who submit their training via an expense report, and they just want all of our training, renewable every year

So to make things easier for everybody, we now offer 3 subscriptions:

DBA $39/mo – with DBA Job Interview Q&A, HA/DR, hardware, storage, virtualization.
Performance Subscription $49/mo – wait stats, indexing, query tuning, table statistics
Enterprise (Annual) – all of the above at a 43% savings.

But because you’ve been around here a while, you know that we launch stuff with a sale – and this one’s a doozy.

Knock 50% off your subscription for life with coupon code Relaunch2017 in April. Not just your first subscription – but 50% off that subscription for the whole life of it! It’s the gift that keeps on giving. Pick your subscription now.

Existing owners are getting a pleasant surprise in their emails today: if your subscription included any DBA class, then you’ve been upgraded to the entire DBA subscription through the life of your membership. Same thing with performance classes. Time to bust open that browser and get to learnin’!

If you’d like to extend your subscription, you can buy an extension this month (or upgrade to the Everything Bundle) and the time will be added on to your existing subscription. Enjoy!

[Video] Office Hours 2017/03/26 (With Transcriptions)

Last Updated April 1, 2017

Brent Ozar Unlimited Team

SQL Server, Videos

This week, Erik and Richie discuss truncating log files, reindexing, using sp_rename, query tuning, columnstore indexes, debugging parameter sniffing issues, query processing internals, and troubleshooting query performance issues.

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017-03-26

If I configure a VM with local SSDs, and the host fails…

Erik Darling: Alright, James F – you have a question about encryption, I’m not the best on that stuff admittedly, I’m not much on security and certificates and what not. I would post that up on dba.stackexchange.com. James F asks another question, though, a follow-up – well, not a follow-up, a completely different question -“There is an article on your site that talks about tempdb on local host SSD with a SAN, how does that work with host failover?” Well, geez, that’s not really for VMs. That’s not a good VM idea. That is a good failover cluster idea. So if you’re on a failover cluster you can use local SSDs and when you know, tempdb fails over and is restarted, it’ll just find the new thing and be happy. With VM hosts is completely weird and different and I wouldn’t do it.

Would you automate log file truncation?

Erik Darling: Let’s see here, “Would you automate log file truncation and update growth for next extent in initial…?” – no. No, leave it alone. If you have to ask, leave it alone. I don’t like truncating log files.

Richie Rump: Why is that?

Erik Darling: Well, because they have stuff to grow again. When they grow, they don’t get the instant file initial – it’s an instant file initialize, you can tell I’m on cold medicine – Initialilazize…. So they have to grow and it takes time because they have to zero out all that space, so when you back up a transaction log, the space inside it is truncated. It’s the worst word that they could have possibly used for it because everyone pictures their transaction log shrinking to a magical size when they say your log file is truncated when you back it up, but that’s not it. So internally, space can be reused after you back up a log file. I don’t like truncating log files because they look – when they grow it can be painful. Leave them alone, and as they’re growing, database activity pauses while they grow because nothing else can get logged in there. It’s like oh hang on, we’ve got to grow the file 10GB, so you’re rebuilding a dumb index. Have fun. Alright.

Which index should I rebuild next?

Erik Darling: J. H asks, “Any good methods of anticipating which index to tend to ahead of time before a weekly re-index job and accommodating their needed spacings?” Jeez, well, what indexes usually get rebuilt? I mean, if you’re using Ola Hallengren that all gets logged to the command log table. If you’re using Minion, assume they have some sort of logging apparatus on there, so I would just look at which ones get rebuilt in their size. I’m not sure where you want me to go with that, if you meant something else please let me know.

Have you had issues with sp_rename?

Erik Darling: Joshua asks, “Have you used sp_rename for tables much and have you had issues using it? Just looking for experience points here.” Well, yes actually. I have a fairly old blog post, it was called the sp_rename follies and I got beat up in the comments about not using the GUI or something. Let me see if I can find it.

Richie Rump: The what? What is that? How do you spell that?

Erik Darling: It’s a French word and that means what’s on your screen, dummy.

Richie Rump: Oh wait, Goonies? That’s what my finger movie back in the day man, you should have said that in the first place, that’s amazing.

Erik Darling: It’s a good movie. So let me dump this into a – good enough for the chat window here for Joshua. So anyway, I had this thing with sp_rename where I would mess up and I would either put too many – I would put a DBO in where I didn’t need to and it would rename the table dbo.dbo, and it was dumb and I couldn’t rename it because something else had a lock on the table all of a sudden. I think it was probably like Intellisense looking at this table like did this dummy really do that – so anyway, that was my funny sp_rename folly, when I used it, I put in an extra DBO and I had a dbo.dbo.dbo.

Richie Rump: Yes, so I think the extent for my sp_rename stuff for tables was back in the day, I mean like, mid-2000s, maybe even earlier…

Erik Darling: Mid-2000s?

Richie Rump: Yes, we were doing like poor man’s replication, you know, just hopping data over because we didn’t want to deal with a lot of replication overhead, so as a reporting system we would just copy it over as a table name and then we would do a rename – sp_rename on the table that was in the database and then we would rename the new table actually to whatever the original table name was and do it that way. So that’s old school way of doing things.

Erik Darling: Yes, that’s almost like the weird ETL Schema switch thing.

Richie Rump: Yes.

Erik Darling: Let’s flop a new table in.

Richie Rump: Yes.

Erik Darling: I’ve always like tricks like that. I think they’re neat, but then I always like expect there to be some horrible drawback. Like I remember like – I think I did like Alter Schema switch or I did sp_rename and I remember like waking up in the middle of the night once thinking, but what if like statistics about the old table are caches? I was just like, but wait, what if this horrible thing happens and like I woke up the next morning after like finally drinking a bottle of whiskey and falling back asleep and I was just like, waiting for that email like whatever you did ruined the server, you are fired.

Richie Rump: So is that like an honest thing with DBAs? Waking up in cold sweat at three o clock in the morning like oh my gosh.

Erik Darling: Yes, like every time I would roll out a new job or a new thing to do, like at my last job where it was a relatively shop and I would do this thing where I would script out BlitzIndex to look for missing indexes because with relativity, you can add missing index requests and they don’t mind. They’re like, cool, you’re using our application in a different way, we respect that. So they let you as a DBA, add into indexes kind of to your heart’s content. You can do all sorts of stuff in there. So I would set up these things that would run sp_BlitzIndex and gather missing index requests from across all of the databases. At the time, BlitzIndex didn’t have to get all databases and it had to be run individually and dumped to a table or something. That was annoying, but I did it, it worked, then I would review the missing index requests and I would set up a script that would go through and add missing indexes. I remember when I was doing that, I was petrified and I was just like, every time that ran I would stay up and watch it and just like shake. Like, please don’t mess, no red text, no red text.

Richie Rump: You were Gloria Gaynor, so first you were afraid, you were petrified.

Erik Darling: I was, and I still am. I still wait to get a phone call about one of those scripts, like, hey man, remember that…

Richie Rump: No, no I don’t, I don’t at all, click…

Erik Darling: Nope, never heard of it, neither has my NDA, never heard of it, goodbye.

Ramblings about unit tests and feedback

Richie Rump: See, that’s the one thing I’m pretty happy what happened in the development space over the past ten years or so, is that now we have become so test focused and the test focus has absolutely been on repeating those tests and being able to write those tests and then run them at any given point, so now all the code we write at least here, we have unit test for and…

Erik Darling: Let’s tamper that statement. All the code that Richie writes has unit tests in stuff. All the code that we write has an AWS lab that may or may not meet our needs and you, the end user of our free scripts, are often a guinea pig for us.

Richie Rump: Oddly enough, that’s true, but we get your feedback and we integrate it into the product, so thank you very much.

Erik Darling: We will fix things based on that.

Richie Rump: Yes.

Erik Darling: I do like getting bug reports and stuff because then I have something to do when I’m bored, other than blog and annoy Richie with blog posts and the word blog.

Richie Rump: Yes, I don’t get bored around here. I don’t know what you’re talking about.

Erik Darling: No, I get bored.

Richie Rump: Yes, I know you get bored, but I don’t get bored. There’s so much stuff to do. I’m like why don’t you just come and help me Erik, and he’s like no, another blog post…

Erik Darling: Well, my problem is I only know Java, so I can’t help you.

Richie Rump: You can’t help anybody. You only know Java.

Erik Darling: I have my doctor Java app and I can write little Java things in there and that’s it. End of it.

Richie Rump: I spent the last 25 years of my career avoiding Java, so there you go.

Erik Darling: It’s not that bad, and it’s gotten better. I mean, 25 years ago, Java today is better than it was 25 years ago. Don’t be silly, Richie.

Richie Rump: Nope.

Erik Darling: Everything gets better – if SQL Server can get better over 25 years…

Richie Rump: Eat your Java beans please.

Erik Darling: I don’t know what that means. It’s probably funny to someone.

If a SQL Server gets more queries, does it get slower?

Erik Darling: Alright, let’s see here. “A user recently complained that some of their queries are taking longer than normal and timing out, looking at our monitoring tool, it looks like there’s more concurrent query activity but not much more can be done if there’s more traffic, correct?” Oh boy. Sounds like you have a…

Richie Rump: Is that correct?

Erik Darling: I mean, on the face of things, no, because you can always tune things more. There’s always a query or an index to be tuned. If it’s a third party app then no, probably not. The only thing you can do is throw some hardware at it, right? More RAM, more CPU, there’s something in there that can be done. I mean, you say not much, I assume that like you know, you’ve kind of thrown your hands up. If it’s a third party app, there’s hardware, if it’s your own internal app then you know, there’s query and index tuning and more hardware, so I wouldn’t say not much. I’d say there’s not much you can do today, but there is stuff that you can start testing and doing over time.

Richie Rump: I would say most things that I have run across with queries, the problem is with the query themselves. So there’s something in the structure of the query that can be adjusted a little bit, whether maybe it’s temp tables or CTEs or there’s something there, maybe a join or two that can be tweaked so it can make the query run faster, and it probably was done by a developer who just doesn’t have much experience with SQL and they just kind of went to joint city and just joined everything together.

Erik Darling: It’s true. That’s a big one, making sure that all your predicates can be searched on. You know, there’s a lot of stuff that I would keep looking at for that, but you know, that’s more than I could fit into a webcast I think.

Should I use columnstore indexes if I have constant inserts?

Erik Darling: Let’s see here. “I have a database, we’ll call it ProductionDB, it has small constant inserts but it’s a pseudo-reporting database. Can I expect, assuming I implement them smartly, large gains in performance by implementing column store indexes?” Jeez, if you’re inserting – I mean, are they a lot of single row inserts or are they lots of large inserts that you can tinker with? Because there is like a weird row number that you have to hit for inserts for column store to take them and make the delta stores and all that other compression thing, fancy language work correctly. If there’s lots of small inserts I would be hesitant to implement column store on it because as far as I know, column store still doesn’t handle writes terribly well, but I haven’t messed with it all that much.

Richie Rump: Sounds like a perfect candidate to test it in a…

Erik Darling: It does sound pretty good. I mean you know, from what I have messed with, column store is pretty rocking technology. The compression is awesome, make your data teeny tiny, lots of aggregation queries are pretty sweet, and you know, if your queries are fairly selective you can get lots of good row and segment elimination – segment elimination I believe it’s called, fancy word for it. So there is lots of great stuff about column store. Whether it’s going to work in this specific use case with the inserts I am not sure.

Richie Rump: Try it, let us know. We would love to find out.

How much memory is SQL Server using?

Erik Darling: Let’s see here. John – someone named John, John Smith, get in the squad car, ask another question. “I’ve assigned 28GB of memory to a SQL server. Task manager of course said it’s using 28GB. Task manager is not reliable for that, how do I tell how SQL server is using those 28GB? Like x percent are going to this thing and y are going to that thing.” There are all sorts of cool system DMVs where you can see what’s in memory. I think there’s like sys.dm buffer memory descriptors or something like that, you can figure out what tables and what indexes are in memory, you can query the plank cache and figure out how many GBs or MBs of plant you have in memory. So there is all sorts of interesting stuff you can look at. I would just check out the DMVs for that. I would just read up on the memory DMVs and figure out which ones are in there. There might be a query somewhere on a blog, I just don’t know – Brent would know if that was on there offhand now. Not that good.

How do you debug parameter sniffing issues?

Erik Darling: In general, how do you start debugging parameter sniffing issues? Well, the first thing I do is I make a temporary stored procedure, I don’t test with local variables and then I figure out what runtime parameters are and what compile parameters are and then I figure out what a plan would look like with each of them. If that’s the same then I start looking for outliers, so I might go into my data and I might start looking for certain values that have a big skew to them. By that I mean if there’s one type of value that has millions and millions or just way more rows than other values in general then I start comparing the plan that that gets by itself – the plans that smaller values get by itself. The classic example is if you have a sales report between Rhode Island and Texas, or if you have the sales reports or procedure you call Rhode Island, it’s going to get one plan, if you call Texas, it’s going to get a different plan. Or, it should get a different plan. It may need a different plan because Rhode Island’s plan would not be sufficient for it. If you call Texas first, you’ll get the big Texas plan and then Rhode Island will reuse that and that might be good or bad, because you might be spending up a lot of additional resources to help out Rhode Island. So there are a lot of ways to go with that, but in general I start looking at common variables to get passed in and if you know, isolated, they get different plans and what those plans are – it could be an indexing thing where like, you know, SQL is too like a weird key lookup plan for a smaller variable because that makes sense, and then when a large variable comes in and starts doing billions and billions of key lookups and that doesn’t make sense for the larger variable, so that could be an index thing where if you can tinker with an index, you’ll take away the key lookup plan as a choice for the large and small variables.

Richie Rump: It sounds like you need to do a little bit of data profiling.

Erik Darling: Little bit.

Richie Rump: When you’re trying to find these parameter sniffing, so if you’re unfamiliar with data profiling, there’s some pretty decent tools out there for it. I wrote a small script that I haven’t touched in a couple years that kind of gets you 75% of the way there, 70% there, called SP Data Profile, but you want to take a look at your data, take a look at your skew and then do all the stuff that Erik just said.

Erik Darling: No, no one’s paying attention there anyway. I feel bad – our transcriptionist is going to fall asleep. Late-stage Groucho Marx is falling asleep.

What causes a query plan to change?

Erik Darling: Nathan asks, “What causes a query plan to change? If all of the variables going in are the same, just the values are different, could auto stats trigger new plans that seem to not be as good?” Well that sounds a lot like parameter sniffing too – could be. So if you’re talking about a stored procedure that takes the same three variables and you pass in the same three variables but then execution plans are different. Well yes, it sounds like auto stats could have updated and invalidated an execution plan and then a weird set of variables may have gotten passed in and you may have gotten one plan and that plan doesn’t make sense for anything else. That could be one thing. If that plan uses temp tables, there’s a recompile threshold with temp tables and store procedures. Paul White wrote the banging-est blog post about it like a decade ago – it’s called ‘Temp Tables and Stored Procedures” if you want to check that out. Just stored procedure uses temp tables, after a temp table changes enough times that something recompiles and you get a different plan and statistics have updated so all sorts of stuff happens, so there’s a lot to look at with that.

Richie Rump: What about an index rebuild, would that trigger it as well?

Erik Darling: Yes, because an index rebuild does update statistics. A lot of other stuff – if there are a lot of table definition changes, you know, stuff like that. If the plan gets thrown out of cache for some reason, like memory pressure, next time you go and run the stored procedure it will have to come up with a new plan and that new plan may have been based on some weird set of variables. I would look into – I guess you can’t really look into that; either it happened or it didn’t.

What’s the best way to find unused indexes?

Erik Darling: Richard asks “what’s the best way to identify unused indexes? What length of time should they be unused before you consider them for delete?” Well, Richard, you know where you are, right? You know that you’re on the Brent Ozar Unlimited Office Hours and that we publish a set of scripts called the First Responder Kit. One of those scripts is called sp_BlitzIndex, and sp_BlitzIndex tells you when you have unused indexes, and that’s a really great place to start. For free.

Richie Rump: For free, and to be honest, those are the scripts that I used first, before I came here.

Erik Darling: I was like dismally familiar with them. Let’s see here, “what length of time should they be unused before considering them for delete?” Well, that’s why you don’t delete them, you just disable them. So then if it comes up that you needed one of those indexes, it’s a lot easier to go back into the GUI and right click and hit rebuild than it is to figure out which index definition you’ve lost entirely. What length of time kind of depends on your application, you know, you could have that index that gets used once a month for an end of month processing report or quarterly for some vice-president report that you might absolutely need but might look unused the rest of the time. So when I start looking at that stuff I like to run BlitzIndex weekly and then look at things and what I like to look for, over longer periods of time than a week, is writes versus reads. So for unused indexes you’re always going to see – things that get flagged as unused indexes in sp_BlitzIndex are always zero reads, but I like to look for ones that also have high writes, because if they have zero reads and zero writes I just don’t care, they can sit there and take up metadata space and not do anything, it doesn’t really bother me much, especially because I don’t rebuild indexes so it doesn’t matter, but if they have really high writes and no reads then I start getting concerned because that’s slowing me down. That’s hurting I/O, that’s hurting modifications, inserts, updates and deletes but it’s not really helping any re-query. So more than any length of time, I like to look at how many writes have accumulated on our index; that’s a good way to go.

What names should Erik and Richie use next week?

Richie Rump: So here’s another question, “next week Erik should be named Sloth and Richie should be named Chunk, or the other way, whatever you guys want, fight over it…” I call Data, but thank you, that’s a good suggestion. Sloth over here is probably…

Erik Darling: I want to take Mouth.

Richie Rump: Mouth…

Erik Darling: He was my favorite.

Richie Rump: That actually is pretty accurate.

Erik Darling: That scene where he’s talking to the maid in Spanish about where the drugs are. That still makes me laugh.

When will I see a GroupBy.org presentation from you?

Richie Rump: Here’s another question, “when will I see a GroupBy.org presentation from one of you fine gentlemen?”

Erik Darling: Well not the next one around because those presentations already got picked. I was sort of intentionally not throwing my hat in the ring for the first couple because I really wanted other people, I didn’t want it to seem like the Brent Ozar Unlimited show for GroupBy, but Brent is actually actively encouraging us to submit for future sessions, so probably the next one.

Richie Rump: Yes, I can’t wait until all of you guys hear about the intricacies of Javascript promises compared to callbacks, because I’m sure all of you wanted to hear about that.

Erik Darling: Yes, tell me more about sorting arrays efficiently.

Richie Rump: Who does that anymore man?

Erik Darling: I don’t know…

Richie Rump: Let the CS101 guys figure it out.

Erik Darling: I’m basically a CS101 guy, that’s my gig.

Should I shrink my database if…

Erik Darling: Let’s see here, “is there a reason to shrink if you have a non-normal operation that made it grow in the first place?” Well, I’m going to – yes, go ahead and shrink it down. Just truncate the empty space. I assume you’re talking about your data file or your log file, I assume you’re not talking about anything personal. So what I would do is just get rid of the empty space, there’s a truncate only thing you can do where you just get rid of the empty space; that’s usually the safest thing to do. If you – see whenever people talk about the bad stuff that shrinking a file causes, it’s like when you shrink a file like way beyond just what’s empty, when you start mashing stuff things get really bad. Lately when I’ve tested it and I’ve looked at things and I’ve just been getting – like when I do demo stuff and I blow up stuff by doing things then I have all this stuff in my overflow data file that just isn’t used by anything, I’ll truncate the empty space in 1GB or 5GB increments, and then like if I check the next fragmentation after that, like oh, this would make a good blog post, but I haven’t seen anything from fragmentation like that in a while so…

What headsets and microphones do you guys use?

Richie Rump: Meanwhile I’m in derailing mode, so Erik, which make/model headset and microphone are you using? Pros and cons…

Erik Darling: I don’t know, Brent sent it to me… An Audio Technica BPHS1, and it is hooked into, what’s it called, a Scarlett Focusrite?

Richie rump: Focusrite Scarlett 2i2 – I think that’s pretty standard what we’ve got here.

Erik Darling: It has the one jack, not the two jacks.

Richie Rump: Oh, so you’ve got the Focusrite Scarlett Solo?

Erik Darling: Yes, well it has the big jack and then the little microphone jack and the knob and I can hear myself talk too much, and it has a guitar thing on it, so I guess if I wanted to play the guitar I could do that too.

Richie rump: Yes, that’s the Solo. The audio interface that I’m using is the Focusrite Scarlett 2i2, has two audio inputs; never use them, only need one because there’s only one of me, and I’m also using the Audio Technica ATH-50X headphones with Memory Foam inserts as well as a special skin, you can see it right there, because I’m vain like that, and I’m also using an Audio Technica 2005 microphone which is a USB and XLR. One of the kind of flexibilities of USB or XLR, I think I’ve kind of outgrown that and I only need an XLR because I’m using the Scarlett so much, but if I’m ever on the road and I need a USB, it’s kind of there. So that’s kind of my gear and how we’ve got the setup.

Erik Darling: And that was my chance to tune out. Thanks, Richie.

Richie Rump: Everybody except that one guy is like “what?”

Erik Darling: Good time, good time.

What’s your favorite learning resource for internals?

Erik Darling: Alright, Germal asks a fun learning question. I like this one because I don’t have to get too specific and it’s okay if I NyQuil this one a little bit, “what’s your favorite resource to learn about query processing internals and what do you recommend for getting better at troubleshooting query performance issues?” Well for me, Paul White’s blog posts – when you start reading them they will be a million years beyond your reach, but as you mature they will start to make more sense. Ditto Adam Machanic’s blog posts, they both write some rocket science-y stuff. As far as other stuff goes, the SQL server internals books by Kalen Delaney at all are good. there are optimizer chapters written by Connor Cunningham who used to own the query optimizer at Microsoft and the 2012 and 2008 are two versions of that, pretty much the same so I would just stick to the 2012 one; those are my favorites, those are my crazy people.

Richie Rump: I think I started my journey just kind of paying attention to what people were writing over Twitter and posting on Twitter. A lot of, oddly enough, Brent Ozar stuff – I Don’t know why you’d listen to that guy… Then I started attending a lot of SQL Saturdays and started presenting to a bunch of SQL Saturdays about project management. That’s kind of how I got started, I was talking about project management at SQL Saturdays so I could go in and I could meet the speakers the night before and then meet and hear all the database stuff kind of afterwards, and Summit is a great place I think to learn a whole bunch of different stuff and meet a bunch of different people. So that’s kind of how I came into the whole database stuff, and now I don’t even touch it at all. Thanks, guys, back to my developer roots…

Erik Darling: The gateway to query processing knowledge really is learning about execution plans. So the more you can learn about execution plans the more you can pry into…

Richie Rump: Oh and Grant has a good book on that too, I believe.

Erik Darling: He does.

Richie Rump: Yes I’ve read that one, that was a good one.

Erik Darling: Yes, Grant has a good book on – Grant Fritchey has a book on execution plans, and a long time ago a guy named Fabiano Amorim published a series for Simple Talk where he talked through common execution plan operators, and I used to love reading that over and over again.

Richie Rump: I remember that, that was pretty good.

Erik Darling: yes, it was like 14 or 15 different things about all different query plan operators and what really impressed me so much about those was that this guy could write queries that would produce certain operators, and that’s when it started to make sense to me, like oh no, there’s a logic to this. Like when I do certain things, certain operators are going to pop up and this is why. That was one of the first like light bulb SQL moments I ever had, reading through this like “you can make it do that, that means it does that reliably based on things!” And that was lovely for me.

Richie Rump: So my first ah-ha in SQL server was I was at SQL Saturday Orlando and I sat through an Adam Machanic hour-long session and I didn’t get lost. Right, so this was like a 400 level session, and I’m sitting there and I’m hanging in there and hanging in there, I’m like I got it, I got it, and then at the end it was like “I understood that!” Light bulbs started going off, fireworks behind my head, I’m like wow…

Erik Darling: Do you remember which talk it was?

Richie rump: No, I have no idea. I’m sure I could Google back and figure it out.

Erik Darling: One of my favorite talks ever by him was – it’s on Youtube, it was a TechEd conference talk and it’s called like the Top Five execution Plan Problems That You Can Fix or something like that, and it was like watching – like I’m with you on that…

Richie Rump: Was that the crazy ones where he goes, you know, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10?

Erik Darling: Yes…

Richie Rump: It may have been the same one.

Erik Darling: Yes, and then like all of a sudden it’s parallel and there’s a distinct and the averages and you’re like “I get it, that made sense to me, I can go home and reproduce that!” But yes, Adam Machanic and Paul White are two of the main guys.

What are you learning next?

Richie Rump: I know we’re a little bit over time but let’s get one more, “what are you each learning now for your next blog posts?”

Erik Darling: Learning? Oh boy – I’ve been writing and rewriting a lot of training material for some upcoming videos that are going to be part of our cool subscription re-launch. So I’ve been going over training materials on sargability, dynamic SQL, filtered indexes, computed columns, common table expressions, tons of stuff like that. I don’t know how much of it is going to make it to blog posts because I do still have to record it all, but we’ll see what happens.

Richie Rump: Yes, I’ve been doing a ton of server-less and Javascript and that type of stuff, so that probably is not going to make it to our blog at least, BrentOzar.com. I don’t really write so much because I’m too busy doing stuff, but I did start a blog post series on Entity Framework, kind of for the DBA and kind of like the top five or six things that you can do to kind of optimize and find problems in any framework queries. So I’m hoping to finish that soon, but I started that last year. So we’ll see how much I can get done, because as opposed to Erik, I’m not bored here.

Erik Darling: I’m not actually bored, sometimes it’s just weird to be able to choose what you want to do sometimes. Like you have a mission, you have a dedicated purpose…

Richie Rump: I’m on a mission. This is a mission, not a smalltime thing…

Erik Darling: You know, like when I have free time, in other words when I’m not on client time and I get to pick what to do, I’m like First responder Kit, blog, new training materials. There’s like all these directions that I get pulled in.

Richie Rump: Someday I’m going to write a video training and it’s just going to be how to live with Erik, and it’s going to be like an hour or two hours of just that.

Erik Darling: I can’t wait for my wife to watch that.

Richie Rump: I’m mainly doing it for your kid, do…

Erik Darling: Oh she doesn’t know, she barely knows I’m here.

Richie Rump: She will, she will figure out that you’re in a closet, okay.

Erik Darling: She will. Well, it’s not that closet-y, this is like a separate zip code in New York.

Richie rump: It’s closet-y.

Erik Darling: You’re closet-y. We should go – goodbye, thank you for showing up. Wish me well because I’m sick, goodbye.

Wanna join us at the next Office Hours to ask a question? Register here.

Checking for Strange Client Settings with sys.dm_exec_sessions

Last Updated April 5, 2017

Brent Ozar

Configuration Settings, Indexing

In my performance training class, people get really excited when I cover filtered indexes, computed columns, and indexed views.

Then they get a really strange expression on their face when I talk about one of the biggest drawbacks: if your connection options aren’t set correctly, your delete/update/insert (DUI) operations will fail. That gave me a really fun time once when I implemented an indexed view, and they called me back the next day screaming because their app was down. My bad. (sigh)

To use these features, NUMERIC_ROUNDABORT has to be set OFF, and ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS, ARITHABORT, CONCAT_NULL_YIELDS_NULL, and QUOTED_IDENTIFIER must all be set to ON.

People just kinda look at each other and go, “Uh, how am I supposed to know what connection options people have?”

That’s where sys.dm_exec_sessions comes in. For all current sessions, it shows everything except NUMERIC_ROUNDABORT. You can run a query like this every minute via an Agent job, log it to table, and then go back to see if any culprits were found:

SELECT *
FROM sys.dm_exec_sessions
WHERE is_user_process = 1 AND
(ansi_nulls = 0
OR ansi_padding = 0 
OR ansi_warnings = 0 
OR arithabort = 0 
OR concat_null_yields_null = 0 
OR quoted_identifier = 0)

SELECT *

FROM sys.dm_exec_sessions

WHERE is_user_process = 1 AND

(ansi_nulls = 0

OR ansi_padding = 0

OR ansi_warnings = 0

OR arithabort = 0

OR concat_null_yields_null = 0

OR quoted_identifier = 0)

Ta-dah:

For bonus points, you can filter on database_id, but just be aware that someone can be in a system database like MSDB and still do fully qualified DUI operations against a user database.

Decrypting Insert Query Plans

Last Updated March 25, 2017

Erik Darling

Execution Plans, SQL Server

We’ve all been there

Staring at a simple query that has a three mile long execution plan. After the first time, you learn to always look to make sure there’s not a secret view involved.

After looking at query plans for modifications involving indexed views, I decided to look at some other fairly common table design items that cause execution plan oddities. I’m focusing on INSERTs here, because… well, because I feel like it. Maybe UPDATEs and DELETEs will be other blog posts. I hate making promises like that, because then when I sit down to write them, I don’t find anything all that interesting.

Various defaults

We’ll look at two kinds of defaults here. Ones that rely on outside functions to supply a value, and hardcoded values. For outside functions, I picked a few of the obvious ones: Identity, Sequences, NEWID(), and NEWSEQUENTIALID().

We have to create a Sequence here, which only works on 2012+ — the definition isn’t all that important.

IF  (OBJECT_ID('crap')) IS NOT NULL
BEGIN
DROP SEQUENCE dbo.crap

CREATE SEQUENCE dbo.crap 
AS BIGINT
START WITH 1
INCREMENT BY 1
NO CYCLE
CACHE 10000
END

CREATE TABLE dbo.ident_crap (Id BIGINT IDENTITY (1,1), cruft BIGINT DEFAULT 9223372036854775807)

CREATE TABLE dbo.sequence_crap (Id BIGINT DEFAULT NEXT VALUE FOR dbo.crap, cruft BIGINT DEFAULT 9223372036854775807)

CREATE TABLE dbo.newid_crap (Id UNIQUEIDENTIFIER DEFAULT NEWID(), cruft BIGINT DEFAULT 9223372036854775807)

CREATE TABLE dbo.newseqid_crap (Id UNIQUEIDENTIFIER DEFAULT NEWSEQUENTIALID(), cruft BIGINT DEFAULT 9223372036854775807)

INSERT dbo.ident_crap ( cruft )
DEFAULT VALUES   

INSERT dbo.sequence_crap ( Id, cruft )
DEFAULT VALUES   

INSERT dbo.newid_crap ( Id, cruft )
DEFAULT VALUES   

INSERT dbo.newseqid_crap ( Id, cruft )
DEFAULT VALUES   

DROP TABLE dbo.ident_crap, dbo.sequence_crap, dbo.newid_crap, dbo.newseqid_crap

IF (OBJECT_ID('crap')) IS NOT NULL

BEGIN

DROP SEQUENCE dbo.crap

CREATE SEQUENCE dbo.crap

AS BIGINT

START WITH 1

INCREMENT BY 1

NO CYCLE

CACHE 10000

END

CREATE TABLE dbo.ident_crap (Id BIGINT IDENTITY (1,1), cruft BIGINT DEFAULT 9223372036854775807)

CREATE TABLE dbo.sequence_crap (Id BIGINT DEFAULT NEXT VALUE FOR dbo.crap, cruft BIGINT DEFAULT 9223372036854775807)

CREATE TABLE dbo.newid_crap (Id UNIQUEIDENTIFIER DEFAULT NEWID(), cruft BIGINT DEFAULT 9223372036854775807)

CREATE TABLE dbo.newseqid_crap (Id UNIQUEIDENTIFIER DEFAULT NEWSEQUENTIALID(), cruft BIGINT DEFAULT 9223372036854775807)

INSERT dbo.ident_crap ( cruft )

DEFAULT VALUES

INSERT dbo.sequence_crap ( Id, cruft )

DEFAULT VALUES

INSERT dbo.newid_crap ( Id, cruft )

DEFAULT VALUES

INSERT dbo.newseqid_crap ( Id, cruft )

DEFAULT VALUES

DROP TABLE dbo.ident_crap, dbo.sequence_crap, dbo.newid_crap, dbo.newseqid_crap

Tables, right? Am I right, people? Tables. What happens when we insert stuff?

The INSERT plan generates two Compute Scalar operators. One of them is simply sticking the default value in, but the other calls a function called getidentity, which calculates the next value in the identity column. I mean, I’m guessing that it does that. If it does something else, well, there goes my credibility!

Now, just hovering over the Compute Scalars doesn’t tell you anything about them. You have to highlight the one you’re interested in and hit f4 to bring up the Properties window over on the right side of SSMS. That’ll get you something that looks about like this.

I’m only showing this one once, because it’s the same every time it shows up. If you really need to verify that, play along with the code at home.

You’re fun looking. Check out all those parenthesis.

Inserting into the table with the Sequence generates the same exact two Compute Scalar plan, but naturally it calls the Sequence we defined to pull the next value instead of getidentity.

Now, this one threw me off a little bit. The plan is simplified to two operators.

Even though it still calls a function to generate the NEWID() value, and produces a scalar value for the default value, the query plan for some reason chooses not to tell us about them. I’ll have to do some other testing on why this is.

The NEWSEQUENTIALID() insert plan is back to the same ‘wide’ plan as the others. And, as is expected, it references that function on insert.

That hopefully clears up some of the more mundane reasons for additional query plan operators during INSERT. Not every Compute Scalar is an awful udf — though they can be that too, and sp_BlitzCache will warn you about that.

More interesting

Two other constructs that can cause additional operators are check constraints and foreign keys. Did you know that SQL Server doesn’t honor foreign keys on temp tables? That seems like something I should have known, but I guess I’d never tried it before. When I set up this demo with temp tables, I got this message and had to change a bunch of stuff.

Skipping FOREIGN KEY constraint ‘idfk’ definition for temporary table. FOREIGN KEY constraints are not enforced on local or global temporary tables.

Anyway, let’s start with a simple check constraint.

CREATE TABLE dbo.crap_ck (Id BIGINT IDENTITY(1,1) PRIMARY KEY CLUSTERED, cruft BIGINT NOT NULL CHECK (cruft > 0 AND cruft < 11) )

INSERT dbo.crap_ck ( cruft )
    VALUES ( 1 )

DROP TABLE dbo.crap_ck

CREATE TABLE dbo.crap_ck (Id BIGINT IDENTITY(1,1) PRIMARY KEY CLUSTERED, cruft BIGINT NOT NULL CHECK (cruft > 0 AND cruft < 11) )

INSERT dbo.crap_ck ( cruft )

VALUES ( 1 )

DROP TABLE dbo.crap_ck

This plan is a bit more interesting. We get a Constant Scan, which is SQL Server’s way of telling us it’s creating one or more rows to do something with later. I’ll give you a bonus plan with a more interesting use for them in a minute.

The check constraint logic is housed in the Assert operator.

This is the thing that will throw an error if you try to insert an invalid value.

It’s always a CASE expression, isn’t it?

So, that more interesting use of Constant Scan? It can crop up during a phase of query optimization called contradiction detection. That’s when SQL Server tells you that your logic is broken and doesn’t even really bother making a fully formed querry babby. It can kick in for a whole bunch of reasons, but here are a couple examples.

SELECT *
FROM dbo.crap_ck AS cc
WHERE cc.cruft > 11
AND 1=1

SELECT *
FROM dbo.crap_ck AS cc
WHERE cc.Id = 1 AND cc.Id = 10
AND 1=1

SELECT *

FROM dbo.crap_ck AS cc

WHERE cc.cruft > 11

AND 1=1

SELECT *

FROM dbo.crap_ck AS cc

WHERE cc.Id = 1 AND cc.Id = 10

AND 1=1

The 1=1 is there to avoid getting a trivial plan, which ruined my demo at first. The 1=1 (for now) prevents SQL from creating and using a trivial plan for our query. I learned that trick from Kendra who learned it from Paul White. It’s pretty neat, and it makes SQL Server really think about what it’s doing. Upon thinking, it realizes neither of those two queries will return anything, so it shortcuts doing a whole bunch of work just to return 0 rows.

You’ve come a long way, baby

So here’s what foreign keys can do to query plans.

CREATE TABLE dbo.crap_pk (Id BIGINT IDENTITY (1,1) PRIMARY KEY CLUSTERED, cruft BIGINT NOT NULL)

CREATE TABLE dbo.crap_fk (Id BIGINT IDENTITY (1,1) PRIMARY KEY CLUSTERED, cruft BIGINT NOT NULL, Id_fk BIGINT NOT NULL CONSTRAINT idfk FOREIGN KEY REFERENCES dbo.crap_pk(Id))

INSERT dbo.crap_pk ( cruft )
SELECT 1

INSERT dbo.crap_fk ( cruft, Id_fk )
SELECT 1, 1

DROP TABLE dbo.crap_fk, dbo.crap_pk

CREATE TABLE dbo.crap_pk (Id BIGINT IDENTITY (1,1) PRIMARY KEY CLUSTERED, cruft BIGINT NOT NULL)

CREATE TABLE dbo.crap_fk (Id BIGINT IDENTITY (1,1) PRIMARY KEY CLUSTERED, cruft BIGINT NOT NULL, Id_fk BIGINT NOT NULL CONSTRAINT idfk FOREIGN KEY REFERENCES dbo.crap_pk(Id))

INSERT dbo.crap_pk ( cruft )

SELECT 1

INSERT dbo.crap_fk ( cruft, Id_fk )

SELECT 1, 1

DROP TABLE dbo.crap_fk, dbo.crap_pk

Ready for this? It goes and looks at the whole other table. Teehee.

This is why people tell you to index your foreign keys

The magic is in the Join and the Assert.

Again, this is what throws an error when your foreign keys don’t line up. Yeehaw.

Insertresting

Next time you look at an Insert query plan that has all sorts of horrible things going on in it, I hope you think of me. And I hope this blog post helps you figure out what all is going on with them.

Thanks for reading!

CTEs, Views, and NOLOCK

Last Updated March 25, 2017

Erik Darling

Humor, SQL Server

This is a post because it surprised me

It might also save you some time, if you’re the kind of person who uses NOLOCK everywhere. If you are, you’re welcome. If you’re not, thank you. Funny how that works!

I was looking at some code recently, and saw a CTE. No big deal. The syntax in the CTE had no locking hints, but the select from the CTE had a NOLOCK hint. Interesting. Does that work?!

Oddly, it does. It also appears to be true for views.

Let’s construct a silly test!

In the red corner, weighing in at 5,899 rows, we have an uncommitted update.

BEGIN TRAN;

UPDATE u
    SET u.Reputation += 100
    FROM dbo.Users AS u
    WHERE u.Id BETWEEN 1 AND 10000;                  

ROLLBACK;

BEGIN TRAN;

UPDATE u

SET u.Reputation += 100

FROM dbo.Users AS u

WHERE u.Id BETWEEN 1 AND 10000;

ROLLBACK;

In the blue corner, weighing in at 100 rows, is a CTE.

WITH    c AS ( SELECT TOP 100 u.DisplayName
                FROM dbo.Users AS u
                JOIN dbo.Posts AS p
                ON  p.OwnerUserId = u.Id)
     SELECT *
        FROM c WITH ( NOLOCK );

WITH c AS ( SELECT TOP 100 u.DisplayName

FROM dbo.Users AS u

JOIN dbo.Posts AS p

ON p.OwnerUserId = u.Id)

SELECT *

FROM c WITH ( NOLOCK );

If we run that query as-is, it finishes about as instantly as you’d expect. If we run it without the NOLOCK hint, it gets blocked as you’d expect. Running sp_WhoIsActive with @get_locks = 1, we can see the locks taken out by the update.

In this case, you can see the PK/CX is locked, along with a couple nonclustered indexes I had for another demo. They had Reputation in the definition, so they got locked by the update modifying Reputation.

As the title suggests, this also applies to views

Here’s the CTE as a simple view. Again, no hints in the view itself.

CREATE VIEW dbo.Top100Nonsense
AS
       SELECT TOP 100 u.DisplayName
        FROM dbo.Users AS u
        JOIN dbo.Posts AS p
        ON  p.OwnerUserId = u.Id;
GO

CREATE VIEW dbo.Top100Nonsense

SELECT TOP 100 u.DisplayName

FROM dbo.Users AS u

JOIN dbo.Posts AS p

ON p.OwnerUserId = u.Id;

Selecting data from the views behaves the same way.

This one finishes instantly!

SELECT *
FROM dbo.Top100Nonsense AS tn WITH (NOLOCK)

1 2	SELECT * FROM dbo.Top100Nonsense AS tn WITH (NOLOCK)

This one hangs on for dear locks.

SELECT *
FROM dbo.Top100Nonsense AS tn

1 2	SELECT * FROM dbo.Top100Nonsense AS tn

Are there any differences?

The only difference is which objects the select queries are waiting for locks on.

The select from the view has an additional granted request on the view name as well.

The CTE doesn’t care about the view, obviously.

Now with less nolock

So, for all you NOLOCKers out there, you can now save yourselves oodles of time by only using the hint in outer references to your CTEs and Views.

Congratulations, I suppose.

(Please stop using NOLOCK.)

Thanks for reading!

Confession: I recommended that a 5GB database get split up.

Last Updated April 5, 2017

Brent Ozar

Licensing, Microsoft Azure, SQL Server

Listen, I can explain. Really.

We had a client with a 5GB database, and they wanted it to be highly available. The data powered their web site, and that site needed to be up and running in short order even if they lost the server – or an entire data center – or a region of servers.

The first challenge: they didn’t want to pay a lot for this ~~muffler~~ database. They didn’t have a full time DBA, and they only had licensing for a small SQL Server Standard Edition.

At first, that sounds like a perfect fit for Azure SQL DB. For a relatively small monthly cost, they could have Microsoft manage their high availability and disaster recovery. Want more replicas in more data centers? Just turn the knob, and your credit card bill gets warmer. RPO/RTO becomes the business question it was always meant to be – if you wanna be more available, you just change your billing level, and there’s no implementation project or staff training.

But there was a catch.

During due diligence, we unearthed another tricky challenge: a lot of the data was regenerated from scratch, every single day, during overnight ETL jobs. This regenerated data was projected to grow between 10x to 1,000x in the next two years.

50GB in a single Azure SQL DB isn’t a big deal.
50GB of data regenerated every night, however, is.
And 5TB of that is simply untenable.

See, Azure SQL DB isn’t just priced by storage capacity and availability: it’s also priced by performance capacity. The more data you churn, the more cash you burn.

So we asked a question: which parts of the data really need high availability?

In this case, the data that was constantly churned was only for internal reporting purposes. It didn’t drive the web site. It could have a completely different RPO/RTO goal.

And the mission-critical data that drove the web site? It was actually fairly compact and stable, less than 1GB of data altogether.

The answer was to split the 5GB database into two parts:

A small core subset of mission-critical data living in Azure SQL DB
A skyrocketing but lower-value majority of data living on-premises in their already-paid-for SQL Server Standard Edition box, with easy-to-handle HA/DR technologies designed for less ambitious RPO/RTO

Here’s how you can make these same decisions.

In our First Responder Kit, we’ve got a set of worksheets, scripts, and e-books that I teach you how to use during my Senior DBA Class. The RPO/RTO worksheet helps you and the business users determine your high availability goals:

After filling that out, if you need to buy new hardware (or rent new cloud space), then you use the hardware sizing worksheet:

You gather the business users and IT teams into a conference room, put this up on the projector in Excel, and talk them through each of the questions. You record everyone’s answer, and then come to a consensus.

Then, armed with these answers, you can build a better long term architecture design.

And yeah, sometimes that means splitting a 5GB database into two parts.

[Video] Office Hours 2017/03/22 (With Transcriptions)

Last Updated March 27, 2017

Brent Ozar Unlimited Team

SQL Server, Videos

This week, Tara, Erik, and Richie discuss statistics updates, in-place upgrades, row vs page compression, deadlocks, replacing SQL backup vendors, server load, ETL, and whether they think that database mail should be used by user applications to send their emails.

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2017-03-22

Can I performance tune statistics updates?

Erik Darling: All right, it’s 12:16 now, a minute late. We might have to go a minute over to answer all you people’s questions. Let’s just start with the top here. Oh, this is a fun question. “This might seem backwards, but is it possible to performance tune a statistics update with full scan?”

Tara Kizer: No.

Erik Darling: No. What’s even kind of crappy about that statistics update is I don’t even think there is a sort in tempdb option like there is for indexes. So, you are just stuck doing all that work.

Tara Kizer: The only thing that I would add, do you actually need to do a full scan? On the systems that I have supported, the number of rows directly pertains to what sampling you might need. As the table gets bigger, you might even get away with ten percent. I have even done one percent sampling on tables that were say were a hundred million rows or over.

Erik Darling: So, what I would say is rather than try to have a giant full scan on a full table’s worth of statistics, I would try creating some filtered statistics on the values that you are interested in. Perhaps if there is a certain range of predicates that has an abnormal skew in your data, I would try creating a filtered, or maybe a few different filtered statistics, on the skewed data to see if that helps at all. Because that way, if you do have to update that particular statistic with a full scan you are dealing with a smaller amount of data or something like that. Something reasonable, right? A reasonable assumption, smaller amount data than the entire table. Who knows though, who knows?

(Note: SQL 2016 has some tricks where you can update multiple stats on the same table at the same time, but that’s left up to you and your simultaneous scripting. Good luck with that.)

Do in-place upgrades upgrade everything?

Erik Darling: Let’s see here, we have a question about upgrading, in place upgrades, heaven forfend. “When upgrading SQL 2012 to 2014 will in place upgrading upgrade the reporting databases as well?” I don’t think they get left behind.

Tara Kizer: Yeah, they are part of the upgrade but you need to also take care of reporting services. Make sure that the version that you are running there is going to work as these databases do get upgraded. I don’t even like having report server databases on my production instance. I want it on a separate instance away from my OLTP stuff. Maybe it will have a lesser license server. And, I don’t even like in place upgrades, forget that. For production, heck no.

Erik Darling: No. Let’s answer your question with a list of stuff Tara doesn’t like: in place upgrades, reporting databases on the same server, and even reporting services as a service on the same server.

Tara Kizer: Oh, heck no. Yeah, that never goes on my server. I might [inaudible 00:05:41] the RS databases but definitely not the service.

Erik Darling: So, list of things for you to do this weekend. Move all your stuff to a different server then do your upgrade, but not in place.

Richie Rump: No, never.

Tara Kizer: When I say in place, I mean the actual upgrade wizard where you lose your instance, you know the entire instance is upgraded. If this is like newer hardware or maybe just getting a newer hardware and you want to just install a new instance side by side on the same server, I have done that for sure. A side by side upgrade does not have to be on another server. In the days with virtual machines, everyone is running virtual machines these days, what’s the big deal about setting up a new VM temporarily?

Erik Darling: Yeah, so, to recap don’t do in place upgrades. Don’t keep your reporting service database on the same server, and don’t keep reporting services as a service on the same server. You are going to want to separate that stuff out and not break your entire server and reporting services.

Richie Rump: And run everything in the cloud, he forgot that one.

Tara Kizer: There you go.

Erik Darling: It’s not your money, why not?

Richie Rump: It’s not your server either.

Erik Darling: No, it’s true. Well, you never know, it might be. I might have branded it.

Should I compress everything by default?

Erik Darling: John asks, “I read on a blog post by Greg Low that row compression is recommended over none across the board. Would you agree with this or in other words when would none compression be good?”

Tara Kizer: None compression, that’s just it disabled, right? That’s the default.

Erik Darling: Well you squished your nones down so they all fit in a row.

Tara Kizer: I like compression. I did extensive testing on compression and we used that one script that’s out there on the internet somewhere that it can test whether or not if you should do page versus row compression on each index. So maybe you don’t only have row compressed indexes, you also do page. Anyways, I liked it for performance. I liked it for storage. I didn’t experience that much higher CPU utilization. I didn’t have servers that were bottlenecked. Honestly, I think I would probably move forward with it out the door. In new databases, I’d enable it. The only issue recently is I had a client that had Enterprise Edition, and that was the only Enterprise Edition feature they were using. They had a server with tons and tons of cores. It’s real expensive to license Enterprise Edition. So, this is an Enterprise Edition feature so be sure that—those licenses are expensive.

Erik Darling: Yeah, I mean as far as I know, the only time I would say don’t compress, or like why I wouldn’t just have it as the default, is because not everything compresses well.

Tara Kizer: Yeah, that’s true.

Erik Darling: So, it doesn’t do you any good to turn it on and then not have it actually compress anything. Then it’s just kind of a waste. But like Tara said, there are scripts out there that can estimate compression, you know, you can run that sp estimate compression stored procedure on all your tables and figure out which ones and which indexes would benefit most from that. The other thing to be a little bit careful of is if you have a really high stress OLTP workload sometimes, I’ve found compression can bog down modifications and stuff. Select and everything would run really great.

Do the deadlock trace flags add overhead?

Erik Darling: All right, Mike asks, “I have intermittent deadlocks occurring on a SQL Server 2012 database that I am not able to reproduce in test. What is the performance impact of enabling the relevant trace flags and leaving them on to log the deadlock and to SQL until I have got a few examples of it analyze?”

Tara Kizer: There is no performance concern with enabling those trace flags. Some people just enable it for all new instillations that they do. It’s fine, it just clogs up the error log and it’s really hard to read that information in the error log anyway, but at least you have it. Reproducing it in a test environment is hard because you are not simulating the load and it’s really the production load that causes it. Blocking in a system is normal and the longer those blocks last, the more likelihood of having deadlocks occur and your production hardware is probably better and it loads probably a lot higher. But yeah, trace flags are fine in the error log.

Erik Darling: Turning on those trace flags is fine in that you can get error log information from that. But you’re on 2012 and extended events are a much better thing than they are on at least 2008 R2 and 2008. So, on 2012 well, going back further, but at least on 2012, you do have the system health extended event running constantly, which will have some of your deadlocks in there. It’s not going to have all of them because I believe it hangs out in the buffer ring. So, it’s not going to have all your deadlocks going back historically. But you could probably find some recent ones in there. I am sure if you Google around for “system health session deadlocks” you could find a—there is a script out there from a Microsoft guy that does a pretty good job of parsing things. You could also set up a dedicated extended event session to run and sample and collect deadlocks, because that will only fire off when deadlocks are actually happening. So, there is not a lot of overhead to just having it on waiting for a deadlock to occur. You get the XML, you get the picture graph, where it’s much easier to decipher exactly what deadlocked and why. So, the trace flags are fine and there’s no overhead to dumping stuff in the error log, but I think since you’re on a slightly more modern version of SQL Server, you do have some better options available to you.

I have this really long question…

Erik Darling: Darlene, post that on dba.stackexchange.com. That’s way too little information to figure out what’s going on with your view. We would need to see some more stuff than that.

Richie Rump: Here is a little hint about dba.stackexchange.com. We see it, we kind of see everything that kind of goes through there, like all of us are trolling on it. So, there is good chance that one us may actually answer your questions if you put it on there. So, if you don’t pay attention to dba.stackexchange.com don’t worry because we do.

Tara Kizer: Erik gets all the cool query questions. I take the AG questions.

Erik Darling: Yeah, we split things up pretty fairly. No turf wars with that.

Richie Rump: I just wish you would not go into our Slack [inaudible 00:12:34].

Erik Darling: We could have a separate channel for Slack Stack questions—Slack channel for Stack questions.

Richie Rump: We wouldn’t pay attention to it as much if it was a separate channel.

Erik Darling: I would.

Richie Rump: It’s fine where it is.

Erik Darling: I’m horrible. As soon as I see that little green dot [inaudible 00:12:52] go up over the hash sign, I am like, “Who said what?” Who’s in Slack?

Why do I see a residual predicate in my plan?

Erik Darling: Gordon asks an interesting question. “Query filter where ID is not null and code not equal to zero results in residual predicate for the code not equal to zero part of the filter, even though code is an included column of the index where ID is a key column. Why the residual predicate?” Well, because it’s a range. You’re going to have a residual predicate there anyway. You can only seek on one predicate at a time. If you look at any query that has multiple predicates, you see seek keys and then residuals. So, also non-key columns can’t be seek keys because the index isn’t ordered by them. There is no interesting sorting or anything by non-key columns. Included columns just hang out at the very leaf level of the index, very bottom level of the index tree, pyramid, however you want to call it. The data lives there but there is no statistics and there is no ordering and there is no particularly good information about them. It just kind of hangs out there waiting for a select. So, if you wanted to potentially get seek on multiple, you could promote it to a key column or make it a filtered index.

Richie Rump: What I think is funny is that you had leaf level as the bottom level and then some people we know put the leaves on top because the trees on top.

Erik Darling: I always picture it at the bottom.

Richie Rump: Well because that’s where all the documentation goes, right? The documentation, you start here and then you branch off into the leaves at the bottom. If it’s supposed to be a tree, the leaves are at the top.

Erik Darling: Yeah, I know like it’s fundamentally wrong and just like it’s a concept when I explained it at the bottom it’s wrong, but I just can’t… It’s hard for me to be like, “Oh the root of the index is down here and then we add stuff up here.” Get away from me. That’s where mosquitoes come from, I don’t want that.

Can I apply SQL 2008R2 SP3 to SharePoint 2007?

Erik Darling: [Inaudible 00:15:04], I don’t know.

Tara Kizer: Yeah.

Erik Darling: I have no idea what you’re asking me.

Tara Kizer: Yeah, you’re going to have to Google for that I think.

Erik Darling: He is asking if we can apply SP3 and SE to SQL Server 2008 R2… I have no idea how SharePoint 2007 interacts with SQL. I don’t know. Upgrade SharePoint. Why is it on 2007? How old is that now? Ten years? That’s ten years. You have a ten-year-old kid.

Richie Rump: …SharePoint thing I don’t understand it at all. Where documents go to die: SharePoint.

Erik Darling: Colton asks a funny question about index maintenance. Sorry, Colton, I don’t know if that’s a male or female name so I am going to “they” you. They ask, “We recently adjusted our index maintenance to only rebuild on weekends and update stats during the week. This improved our nightly jobs from five to six hours to two and a half to three hours. If stats only helps choose the maintenance plan, does this imply we just had really bad execution plans? I am trying to understand the reason behind the huge performance increase.” So, are you talking about your maintenance plans not running?

Tara Kizer: I think they meant to say execution plan right there.

Erik Darling: Oh, okay.

Tara Kizer: … nightly jobs, maybe purging data, who knows what. They were taking five to six hours, now they are two to three hours. Yeah, maybe out of date statistics, maybe bad execution plans.

Erik Darling: Yeah, could be all sorts of stuff in there. Tara’s got that one. It could be either one of those things. There you have it. I don’t even have to say anything now.

Which backup vendors should I use?

Erik Darling: Kyle asks, “We plan on replacing Idera SQL Safe with Redgate SQL backup. Are there any horror stories? Do you have any tips from past with replacing SQL backup vendors?” God.

Tara Kizer: I haven’t used Redgate SQL backup since SQL Server 2005, ever since compression got put into engine. I know that those tools offer additional features: better log shipping, better GUI, better historical information. All sorts of better stuff, but I didn’t need those features. I just cared about compression, backup compression. So, I ditched them.

Erik Darling: Yeah, I wish I could say that I had replaced the backup product at my last job because I hated it. I’m not going to say what it was.

Tara Kiser: Oh, Commvault?

Erik Darling: I didn’t say anything, but I wish I could say that we had replaced it but, rough.

Tara Kizer: We only use the Commvault to copy our files, our SQL Server backup files. Copy them to the cloud.

Erik Darling: Yeah, Redgate I am sure it’s fine. I don’t know of any misaligned abilities between the two, so I am sure it will be all right just make sure that you can restore your old backups with the new tool I guess.

Have you seen anyone using Always Encrypted?

Erik Darling: [00:18:22 Gama], who left, it’s okay I am still going to ask the question. I’m still going to put the question out there but he left. Left us all alone.

Richie Rump: What’s up G? What’s going down, man? Come on now. You used to love me.

Erik Darling: What’s going on, fella? “If any, what has been your experience with clients implementing always encrypted in SQL server 2016?” I ain’t seen it.

Tara Kizer: I have only had one client that was using SQL 2016 so far.

Erik Darling: Yeah, no one has wanted to talk to us about always encrypted yet. So, sorry about that. If you are reading this later in the transcription or watching the video, sorry, no experience on that yet.

Tara Kizer: Most of our clients are on 2008 R2 and 2012 I would say.

Erik Darling: Yeah, yeah, that’s pretty common numbers.

Why did they take away my database access?

Erik Darling: Kevin asks, “In SQL Server 2008 R2 could a person’s DB owner access to a database suddenly disappear?” Bad news, Kevin, you have been fired.

Tara Kizer: You’ve got a DBA or sysadmin messing with you.

Erik Darling: Yeah.

Tara Kizer: Unless you run into some weird bug, but I have never heard of a bug like that, but somebody has removed your access.

Erik Darling: If Kevin leaves early, we know he has got called into HR.

Richie Rump: Yeah, you need to go talk to your boss like right now.

Erik Darling: Like, “Hey man, I was watching this webinar and they told me I got fired, I was working hard and…”

Are there any gotchas with VM backups for SQL Server?

Erik Darling: Graham asks a kind of interesting question, “Our infrastructure team is evaluating Veeam as a replication and backup solution. They want me to evaluate its SQL backup capabilities. While the product is fancy, it essentially appears to be another GUI for backups. What are your thoughts on some third-party backup providers?” Not necessarily Veeam, but just in general.

Tara Kizer: Yeah, I have been at these large organizations that every now and then they have these brilliant ideas to test out this new backup solution. We’ve tested all sorts of them. The answer always come down to, I am a SQL Server DBA, I want SQL Server backups. When I need to do a restore, I want to be able to have access to those files immediately and start that restore. If it’s on tape or it’s some tool I am not familiar with, it’s going to take me longer to do my job and a restore in production is going to be critical. Veeam, I’m fine with using these third-party tools after the SQL Server backup file is created. Then use those tools to copy the files, copy the Windows files, not the actual SQL Servers backups.

Erik Darling: I’m totally with Tara on the SQL native backups. Those really are, unless you have, you know, massive terabyte plus databases where SAN Snapshots make way more sense, SQL native backups are still the bee’s knees. When people are trying out these backup tools, I don’t think that the backup capability should be what they judge. I think the restore capability should be what they judge. A lot of them have really cool stuff with the log reader, object level restore, other things like that. So while they may be perfectly capable of taking a backup, what I want to know is how they restore that backup. I want know if I can restore that backup if that tool isn’t available. One major grip that I have had with a prior backup tool that will still remain nameless by me, is that when it takes a backup it creates a temporary backup device, which is a GUID. So my backup log would say backed up to device equals some GUID. I couldn’t script a restore from that because it’d just go [inaudible 00:21:55] those devices after it was finished. So when it went to restore stuff, it had to do some internal proprietary process. One time the server that that tool was on went down, no fault of mine because I wasn’t the admin on that server. But that server went down and someone needed to restore, and I said, “Sorry, I can’t restore that for you.” So, one thing that you do want to make sure of it is that you can restore without that tool and if that tool offers anything else as far as restores go. So there’s my blabbering answer on that.

Does an unused database add load?

Erik Darling: Chris asks, “Does an unused database add any noticeable load to a server?”

Tara Kizer: No.

Erik Darling: No, thankfully. I mean, I guess they could run out of disk space a little faster if it’s big but aside from that, I don’t think it’s going to do anything. If you are running CHECKDB or other maintenance on it, then sure it’s going to suck up some space.

Richie Rump: But then it is used, right?

Erik Darling: Then it’s used, not necessarily by people, but you know someone…

Richie Rump: Even if it has pages in memory, then it’s being used for some reason.

Erik Darling: That is true. So, I would temper your definition of unused a little bit. Unused by people or is it unused period? If it’s unused period make it read only, something like that. Drop it, just drop it for fun, you can take Kevin’s job who just lost his DB owner access.

Richie Rump: Don’t be hard, man. Come on now.

Erik Darling: I’m just messing with him.

Should I use the GUI for managing lots of users?

Erik Darling: Tom says he went through an exercise “where we needed to map out and duplicate various logins and user rights using Management Studio. Was tedious to say the least as each DB security had to be mapped one at a time,” what’s up with that? This is why you learn how to script stuff. Tom, my friend, this is why scripting is your friend. This why the GUI is not your friend, there’s a lot of right clicking involved in that GUI. But I will give you one tip with the GUI, something that not many people know about. So, when you click on certain note items in the GUI, if you go up to the view menu in Management Studio, you can look down a couple notches and click on “view object explorer details” and that will often bring up a new pane in Management Studio where you can highlight multiple items, right click, and script them. So give that a shot next time you need to script out a whole bunch of stuff. I know it works for agent jobs and other things like that, like indexes and whatever, so maybe it will work for you in this case.

Richie Rump: Hey, if I could use scripting for things like backups and restores, you could do it for logins, man.

Erik Darling: If Richie can do it, anyone can do it, I think that’s the message.

Richie Rump: Senior DBA, that’s what I got.

How do I import this 2GB text file?

Erik Darling: Oh boy, Ron asks a good ETL question. Richie, we should thank him for this. We should put some maple syrup on this question, it’s delicious.

Richie Rump: The answer is Python.

Erik Darling: He is right. “I have a 2-gigabyte text file that is fixed length without row terminators and no way to get the source changed. How can I import this file into a table?”

Richie Rump: Without row terminators?

Tara Kizer: Yeah, I don’t even understand the format then. Without row terminators, it’s just going to be row after row all together in one row.

Richie Rump: Is it a government file? That would be my question, because I’ve seen some crappy government file stuff, but you have always had your \n or your \t\n type stuff, right?

Erik Darling: Yeah, or \n\r.

Richie Rump: I have got a new row. So is this I need to read every character at that point? And then after 50 characters I have a new row? I mean if that’s the thing? And if that is the thing, then I am probably going to script that out using PowerShell or some C# or something so I can get it into a format that’s a little easier to kind of read, you kind of do that preprocessing step. Because you don’t want to work your ETL process around that type of thing, it’s kind of dirty. So, normalize the format, even if you want to chunk it out into smaller one meg files or something that’s a little more digestible through your ETL, go ahead and do that too but that’s what I would do, I’d change it up a little bit.

Erik Darling: I used to work for a market research company and I used to get horrible data from ten different sources that had to get munged together into one table and cleaned and all sorts of other stuff. So my best advice is to learn a language that allows you to manipulate text files well so that you can clean them before trying to load them. There is some stuff that is not good to do in SQL and there is some stuff no matter how awesome your bulk insert [inaudible 00:27:31] is with format files, that you just can’t get around. There are all sorts of situations with like ragged left and ragged right and ragged all around.

Richie Rump: He came back and said it’s government data. It’s postal service data. So yeah, how do I know this? I don’t know, I feel your pain, Ron. You and me, we are blood brothers now. We felt the same pain. Yeah, you got to do that little bit of preprocessing. Get your own format, and then start processing in a sane manner, as opposed to some of the insane ways our government data is in.

Erik Darling: Off the top of my head advice, I would open it up in a hex editor or some other text editor that shows you the character mappings for everything. So that if you can see what’s actually at the end of a line, because there might be just some weird character at the end of a line that you could do like a quick regex replace with a row terminator that’s readable by SQL or something else. If it is truly fixed width and every line is the same length, then you could fairly easily write a regex thing that just every x characters, you insert this character. So, that would be my advice on that. I’m sorry, I detoured for little while on that. It has been so long since it was like a funny ETL question that I could answer. I was like, “I used to do this crap, it’s horrible, I’m sorry.”

Richie Rump: Listen Erik, if you use regex as your solution, you now have two problems, okay? Just remember that.

Erik Darling: I enjoyed the regex challenge. I wouldn’t say that I was good at it or that I passed with flying colors, but I enjoyed the challenge of writing regex. I thought it was fun to see it work.

Richie Rump: And ladies and gentlemen, every time his regex failed, he got a new tattoo. That’s how that worked.

Erik Darling: I actually just got a line and a tattoo.

Richie Rump: And they all just kind of blended together, you see how that works.

Erik Darling: It’s pretty sad actually. Actually, all my tattoos cover up regex code that I was trying to remember. Like, “Oh, yes, look back and look forward.”

Will you be at PASS Summit talking about Azure?

Erik Darling: Karen is asking if we will be at PASS Summit this year, “and do you know if there are any sessions related to SQL Azure?” They have no session list workshop details on the website. Well, they haven’t—I’m not sure if they even have the call for speakers yet but you can bet your bottom dollar Microsoft is going to be shoving Azure at you.

Tara Kizer: It’s going to be months before you see the schedule go up there, September maybe. We don’t even know who the speakers are.

Richie Rump: Listen, there will be so much Azure stuff that you couldn’t walk anywhere without running into Azure [inaudible 00:30:16]. Don’t worry about the Azure stuff, it will probably be there.

Tara Kizer: Who is going? Are you going, Erik?

Erik Darling: Yeah, I’m going but only because I’m bringing everyone with me. So I don’t have to miss the Halloween thing which is incredibly rude of them.

Richie Rump: So, what about you Tara, are you going?

Tara Kizer: No, I am absolutely not going. I’m protesting it actually. Having it over Halloween is just, it’s just not fair to the—I mean so many people that are DBAs and developers, a lot of them are parents with kids that are under say 13 that, you know, Halloween is a big deal. It’s fun. You get pictures and it’s a moment that you don’t want to miss. So, no, I’m not going.

Richie Rump: I’m considering it, I don’t know, I’m back and forth depending on the day. So, we will see.

Erik Darling: If Richie doesn’t go, he loses like some board game title, so I don’t know..

Richie Rump: Well, no, like half of the people I play with aren’t going to go either, so…

Erik Darling: True.

Richie Rump: So, that’s kind of the way that works. You’ve got to think that they must have had saved a ton just on the location and the week. Then like, “Oh no, DBAs, we are gung ho¸ we will go through anything” and that’s just…

Erik Darling: Also you’re going to lose a ton on all the people not showing up because you start your session on Halloween and no one wants to see anyone from Microsoft in a sexy nurse outfit. See Conor Cunningham in a sexy nurse outfit, not going to work for me.

Richie Rump: Well, talk to me more about [inaudible 00:25:06] again. You have my interest now, sir, you have my interest.

Should we use database mail in user applications?

Erik Darling: Oh god, it’s 12:45, but we started a little late, so I’m going to ask one more question if I can find a good one.

Tara Kizer: I think this question—I like this question…

Richie Rump: Oh, go ahead then, you ask that one.

Tara Kizer: “Do you believe database mail should be used by user applications to send their emails in?” No, absolutely not. Applications have access to so much better tools than database mail. Just because a feature exists in SQL Server does not mean you should use it for the application. Database mail is for DBAs, sysadmins, to get alerts. It’s for, to do things within SQL Server, not for the application.

Richie Rump: Yeah, and there’s a lot of great services out there that utilize email really really well, I mean the SES in Amazon is great. There’s a whole bunch of other stuff out there that handles emails so much better than freaking SQL Server.

Tara Kizer: Just because you can, doesn’t mean you should.

Erik Darling: Don’t spend thousands of dollars per core on an email server, that’s goofy. All right, we are going to leave, we are going to end on that high note. Thank you all for showing up. We’ll see you next week, maybe Brent will be there, I don’t know.

Tara Kizer: Maybe.

Erik Darling: Maybe, we’ll see. Thank you guys, catch you later.

Wanna join us at the next Office Hours to ask a question? Register here.

This Is Your Backup Schedule On Drugs

Last Updated March 25, 2017

Erik Darling

SQL Server

I’ve seen things

Terrible things. I mean, not just working here (which, as I write this, I’m three days away from the two year mark, fingers crossed!), but also out there on the internet.

Once you leave the safe confines of Brent Ozar Unlimited, you’re subject to the ravages of space. Strangers. Radiation. Strangiation. Your bones will scream. It’s bad!

So let’s talk about backup strategies

First and foremost, they need to match your RPO and RTO goals. So, go ahead and get those in writing first. The ‘in writing’ part is important, so no one can turn around and say GOT YOUR NOSE when things get wonky.

Backup schedules are like carefully woven spiderwebs, engineered to be safe, stable, and catch any potential bugs. We all know what happens to spiderwebs when our eight legged friends are exposed to less-than-legal substances.

Let’s play a game

I’m going to list out some of the worst backup schedules I’ve seen, and you get to tell me what you think caused such a lapse of cognizant reality. Please try to limit it to what’s in the medicine cabinet or liquor cabinet, or at least what’s under the kitchen sink.

Oh, forget it. I know you people, and I don’t wanna know what’s in your medicine cabinets.

Assume all the recovery models are Full.

Web 1

Full backups every four hours, log shrink job every morning that failed

Web 2

VSS snaps every 15 minutes that froze I/O for 30 seconds

Web 3

Full backups every night, log backups every 30 minutes that stop between 5pm and 8am

Web 4

Set database to single user, take a full, diff, and log, set database to multiuser (daily)

Web 5

Set database to Simple, take a full backup, shrink the log, set database to Full (weekly)

This place has everything

If any of these apply to you, now’s the time to start fixing them. If not, put down the Endust filled paper bag and let me know what you think was on someone’s mind when they designed these.

Thanks for reading!

We’re Changing Our Online Training – Here’s Why.

Last Updated March 25, 2017

Brent Ozar

Company News

“Who replaced my SQL Server book with a cookbook?”

When we launched our self-paced online training classes, we packaged them as focused courses that addressed a single pain point like:

How to Tune Indexes and Speed Up SQL Server
How to Read Query Plans
DBA’s Guide to High Availability and Disaster Recovery
Virtualization, SANs, and Hardware for DBAs

You paid a fixed price for the course you wanted, and you got access to it for 18 months. We figured you could learn a topic well across the span of a year and a half, and then pick which topic you wanted to tackle next.

You learned stuff about SQL Server, and we learned stuff about you.

You want to learn everything. After we released the Everything Bundle, it quickly turned into our biggest seller by far, accounting for over 80% of our sales. When we surveyed Everything Bundle buyers, they wrote comments like, “I love your style of teaching. Anything you want to teach me, I want to learn.” Alrighty then!

48% of you don’t want to pay up front. About half of the buyers said they want to pay a big chunk up front so that they only have to deal with their corporate expense process once. However, a lot of you would rather pay a monthly subscription fee.

You’re not good at managing your time. Let’s be honest, dear reader: every week, we get emails that say the same thing. “Can I get just another month or two? I’ve been really busy at work, and I didn’t watch a lot of the stuff I wanted to watch.” Even if you bought 18 months up front, you want to be able to add a few more months – especially when we’re adding new material.

I heard you loud and clear.

So this week, we’re taking our sales offline to make some changes. I’m excited to tell you all about it next week when we relaunch.

5 Signs Your SQL Server Isn’t Wearing Pants

Last Updated March 24, 2017

Brent Ozar

CHECKDB and Corruption, SQL Server

We see a lot of SQL Servers.

During our sales process, prospective customers run sp_Blitz on their SQL Server, and we talk through the results. I see a lot of horrifying stuff.

Over time, I’ve built up a pretty good sense of signs that a SQL Server is running around loose in the data center, juggling chainsaws, drinking Keystone Beer, not wearing any pants.

5. Some of the databases haven’t been backed up. sp_Blitz alerts you if there’s no backups in the last two weeks, which is bad. You would probably think if that’s the case, none of the databases have been backed up – but that’s almost never true. What usually happens is that someone set up backups once, picked the specific databases they wanted backed up, and then never looked at that list again. They added more databases, but they didn’t configure backups to match.

4. CPU schedulers and memory offline. There are two install ISOs for SQL Server Enterprise Edition with very confusing names (one just says Enterprise Edition, and the other one says Core-Based.) One of them only supports 20 physical cores. Way too often, I see SQL Servers where 1/4-1/2 of the CPU cores and memory simply aren’t even turned on.

My reaction to your lack of pants. Photo by Yu Tang, no relation to the clan.

3. Build of SQL Server with known corruption & security bugs. This bug and this bug were fixed years ago, and yet many, many database servers out there are still running known dangerous builds.

2. No CHECKDB, ever. sp_Blitz reports on the last time DBCC CHECKDB finished successfully. I’m constantly amazed that people have never even tried to run CHECKDB on any of their databases, period. At the same time, when I ask how much data they want to lose, the answer is, “None.”

1. Databases in full recovery with no transaction log backups. A popular myth (and I used to believe it too, once!) is that full backups clear out the transaction log. Folks put a database in full recovery model, do full backups, and think they’re fine because those two “full” terms match up. Meanwhile, their database is 50GB, and their log file has grown to 500GB.

When I see this stuff, I buckle the belt on my office chair, get ready for a bumpy ride, and ask the client not to turn on their webcam (or at least not point it down.)

Do us both a favor. Run sp_Blitz on your servers today, and put some pants on them.

From The Mailbag: DBCC CHECKDB And Read Only Databases

Last Updated March 23, 2017

Erik Darling

CHECKDB and Corruption, SQL Server

We got an email!!!

It was weird. It was from an Australian, and it was right side up. I could hardly believe it. Normally we encourage people to ask us questions during Office Hours, or over on Stack Exchange. That way lots of people can benefit from the answer.

But since this reader is on the other side of the globe, they only get to listen to Office Hours two days before it’s even recorded. Time is weird like that.

Titular Line

So, can you run DBCC CHECKDB on a read only database? Should you run DBCC CHECKDB on a read only database?

tl;dr: YES AND YES!

Here’s why:
Many forms of corruption that I’ve seen have come from storage. Sure, there have been bugs that were to blame, but yeah. Most of the time, it’s the storage going all yucky.

Granted, if a read only database becomes corrupt, it’s fairly easy to just restore the last good full backup. Unless it’s a really big database. Then it could take some time to restore. That’s RTO — you won’t lose any data here — but it’s generally more courteous to find corruption on nights and weekends rather than let users find it during the day when they try to run a query.

Another reason you may want to do it, is CHECKUMS. If you’re a good little SQL owner, you have alerts set up to email you when some forms of corruption are detected. These alerts rely on verifying page checksums as they’re read from disk. If you made your database read only before a page got a checksum, SQL may never know. Checksums only get applied when pages are written to disk. You can imagine that some rarely-used tables may not have had that happen in a while, and may just be sitting on some storage blocks that went bad.

It’s also entirely possible that the corruption isn’t just limited to one database 😀

It could be all of them 😀

Are there any gotchas?

Not really, but you should be aware of a couple things. While CHECKDB will run fine, it won’t update flags inside the database to tell you it did.

That means sp_Blitz may warn you that CHECKDB is out of date. Realistically, that’s not going to change. You could be using Ola Hallengren’s scripts, you could be using Minion Ware, you could be using Maintenance Plans, or some other 3rd party tool or script. You could even be running them from another server that just gets used to run Agent jobs across your environment. We couldn’t possibly, sanely, account for all the possibilities. There are some real wackadoos out there.

Fortunately, sp_Blitz lets you skip certain checks, and even skip certain checks for certain databases. If it bugs you, look into that. If not, just make sure you check the logging of whatever you use to run CHECKDB to make sure it’s happening. For instance, Ola recently added a pargumeter called Updateability to let you choose to run or not run certain tasks on read only databases. The default is both, but who knows? Like I said, real wackadoos out there.

Thanks for reading!

Get Free SQL Stuff

"*" indicates required fields

X/Twitter

This field is for validation purposes and should be left unchanged.

Name*

First Last

Email*

Things I want*

First Responder Kit (Scripts and Guides)

Monday Links (SQL & tech news)

Blog posts (2-4 per week)

Blog

Fixing it with the claw hammer: MIN_GRANT_PERCENT

Fixing it with the scalpel: CREATE INDEX

While working on a demo

Logging Tables

Logjammin’

What are the pros and cons of Platform-as-a-Service?

What resources get used when you ALTER TABLE?

Why is parameter sniffing an issue in only some of my procs?

What’s your experience with Azure?

Any good references on columnstore indexes?

How should I train someone on Always On Availability Groups?

Can I add an AG listener online?

Do you have a server health checklist?

Do sleeping connections cause negative impacts?

Should I use fill factor 100% or something else?

How do I talk people out of including every field in an index?

sp_Blitz Improvements

sp_BlitzCache Improvements

sp_BlitzFirst Improvements

sp_BlitzIndex Improvements

sp_BlitzWho Improvements

sp_DatabaseRestore Improvements

If I configure a VM with local SSDs, and the host fails…

Would you automate log file truncation?

Which index should I rebuild next?

Have you had issues with sp_rename?

Ramblings about unit tests and feedback

If a SQL Server gets more queries, does it get slower?

Should I use columnstore indexes if I have constant inserts?

How much memory is SQL Server using?

How do you debug parameter sniffing issues?

What causes a query plan to change?

What’s the best way to find unused indexes?

What names should Erik and Richie use next week?

When will I see a GroupBy.org presentation from you?

Should I shrink my database if…

What headsets and microphones do you guys use?

What’s your favorite learning resource for internals?

What are you learning next?

We’ve all been there

Various defaults

More interesting

You’ve come a long way, baby

Insertresting

This is a post because it surprised me

As the title suggests, this also applies to views

Are there any differences?

Now with less nolock

But there was a catch.

Here’s how you can make these same decisions.

Can I performance tune statistics updates?

Do in-place upgrades upgrade everything?

Should I compress everything by default?

Do the deadlock trace flags add overhead?

I have this really long question…

Why do I see a residual predicate in my plan?

Can I apply SQL 2008R2 SP3 to SharePoint 2007?

Which backup vendors should I use?

Have you seen anyone using Always Encrypted?

Why did they take away my database access?

Are there any gotchas with VM backups for SQL Server?

Does an unused database add load?

Should I use the GUI for managing lots of users?

How do I import this 2GB text file?

Will you be at PASS Summit talking about Azure?

Should we use database mail in user applications?

I’ve seen things

So let’s talk about backup strategies

Let’s play a game

Web 1

Web 2

Web 3

Web 4

Web 5

This place has everything

We got an email!!!

Titular Line

Are there any gotchas?

Hi! I’m Brent Ozar.