This week, Erik and Richie discuss truncating log files, reindexing, using sp_rename, query tuning, columnstore indexes, debugging parameter sniffing issues, query processing internals, and troubleshooting query performance issues.
If you prefer to listen to the audio:
Enjoy the Podcast?
Office Hours Webcast – 2017-03-26
If I configure a VM with local SSDs, and the host fails…
Erik Darling: Alright, James F – you have a question about encryption, I’m not the best on that stuff admittedly, I’m not much on security and certificates and what not. I would post that up on dba.stackexchange.com. James F asks another question, though, a follow-up – well, not a follow-up, a completely different question -“There is an article on your site that talks about tempdb on local host SSD with a SAN, how does that work with host failover?” Well, geez, that’s not really for VMs. That’s not a good VM idea. That is a good failover cluster idea. So if you’re on a failover cluster you can use local SSDs and when you know, tempdb fails over and is restarted, it’ll just find the new thing and be happy. With VM hosts is completely weird and different and I wouldn’t do it.
Would you automate log file truncation?
Erik Darling: Let’s see here, “Would you automate log file truncation and update growth for next extent in initial…?” – no. No, leave it alone. If you have to ask, leave it alone. I don’t like truncating log files.
Richie Rump: Why is that?
Erik Darling: Well, because they have stuff to grow again. When they grow, they don’t get the instant file initial – it’s an instant file initialize, you can tell I’m on cold medicine – Initialilazize…. So they have to grow and it takes time because they have to zero out all that space, so when you back up a transaction log, the space inside it is truncated. It’s the worst word that they could have possibly used for it because everyone pictures their transaction log shrinking to a magical size when they say your log file is truncated when you back it up, but that’s not it. So internally, space can be reused after you back up a log file. I don’t like truncating log files because they look – when they grow it can be painful. Leave them alone, and as they’re growing, database activity pauses while they grow because nothing else can get logged in there. It’s like oh hang on, we’ve got to grow the file 10GB, so you’re rebuilding a dumb index. Have fun. Alright.
Which index should I rebuild next?
Erik Darling: J. H asks, “Any good methods of anticipating which index to tend to ahead of time before a weekly re-index job and accommodating their needed spacings?” Jeez, well, what indexes usually get rebuilt? I mean, if you’re using Ola Hallengren that all gets logged to the command log table. If you’re using Minion, assume they have some sort of logging apparatus on there, so I would just look at which ones get rebuilt in their size. I’m not sure where you want me to go with that, if you meant something else please let me know.
Have you had issues with sp_rename?
Erik Darling: Joshua asks, “Have you used sp_rename for tables much and have you had issues using it? Just looking for experience points here.” Well, yes actually. I have a fairly old blog post, it was called the sp_rename follies and I got beat up in the comments about not using the GUI or something. Let me see if I can find it.
Richie Rump: The what? What is that? How do you spell that?
Erik Darling: It’s a French word and that means what’s on your screen, dummy.
Richie Rump: Oh wait, Goonies? That’s what my finger movie back in the day man, you should have said that in the first place, that’s amazing.
Erik Darling: It’s a good movie. So let me dump this into a – good enough for the chat window here for Joshua. So anyway, I had this thing with sp_rename where I would mess up and I would either put too many – I would put a DBO in where I didn’t need to and it would rename the table dbo.dbo, and it was dumb and I couldn’t rename it because something else had a lock on the table all of a sudden. I think it was probably like Intellisense looking at this table like did this dummy really do that – so anyway, that was my funny sp_rename folly, when I used it, I put in an extra DBO and I had a dbo.dbo.dbo.
Richie Rump: Yes, so I think the extent for my sp_rename stuff for tables was back in the day, I mean like, mid-2000s, maybe even earlier…
Erik Darling: Mid-2000s?
Richie Rump: Yes, we were doing like poor man’s replication, you know, just hopping data over because we didn’t want to deal with a lot of replication overhead, so as a reporting system we would just copy it over as a table name and then we would do a rename – sp_rename on the table that was in the database and then we would rename the new table actually to whatever the original table name was and do it that way. So that’s old school way of doing things.
Erik Darling: Yes, that’s almost like the weird ETL Schema switch thing.
Richie Rump: Yes.
Erik Darling: Let’s flop a new table in.
Richie Rump: Yes.
Erik Darling: I’ve always like tricks like that. I think they’re neat, but then I always like expect there to be some horrible drawback. Like I remember like – I think I did like Alter Schema switch or I did sp_rename and I remember like waking up in the middle of the night once thinking, but what if like statistics about the old table are caches? I was just like, but wait, what if this horrible thing happens and like I woke up the next morning after like finally drinking a bottle of whiskey and falling back asleep and I was just like, waiting for that email like whatever you did ruined the server, you are fired.
Richie Rump: So is that like an honest thing with DBAs? Waking up in cold sweat at three o clock in the morning like oh my gosh.
Erik Darling: Yes, like every time I would roll out a new job or a new thing to do, like at my last job where it was a relatively shop and I would do this thing where I would script out BlitzIndex to look for missing indexes because with relativity, you can add missing index requests and they don’t mind. They’re like, cool, you’re using our application in a different way, we respect that. So they let you as a DBA, add into indexes kind of to your heart’s content. You can do all sorts of stuff in there. So I would set up these things that would run sp_BlitzIndex and gather missing index requests from across all of the databases. At the time, BlitzIndex didn’t have to get all databases and it had to be run individually and dumped to a table or something. That was annoying, but I did it, it worked, then I would review the missing index requests and I would set up a script that would go through and add missing indexes. I remember when I was doing that, I was petrified and I was just like, every time that ran I would stay up and watch it and just like shake. Like, please don’t mess, no red text, no red text.
Richie Rump: You were Gloria Gaynor, so first you were afraid, you were petrified.
Erik Darling: I was, and I still am. I still wait to get a phone call about one of those scripts, like, hey man, remember that…
Richie Rump: No, no I don’t, I don’t at all, click…
Erik Darling: Nope, never heard of it, neither has my NDA, never heard of it, goodbye.
Ramblings about unit tests and feedback
Richie Rump: See, that’s the one thing I’m pretty happy what happened in the development space over the past ten years or so, is that now we have become so test focused and the test focus has absolutely been on repeating those tests and being able to write those tests and then run them at any given point, so now all the code we write at least here, we have unit test for and…
Erik Darling: Let’s tamper that statement. All the code that Richie writes has unit tests in stuff. All the code that we write has an AWS lab that may or may not meet our needs and you, the end user of our free scripts, are often a guinea pig for us.
Richie Rump: Oddly enough, that’s true, but we get your feedback and we integrate it into the product, so thank you very much.
Erik Darling: We will fix things based on that.
Richie Rump: Yes.
Erik Darling: I do like getting bug reports and stuff because then I have something to do when I’m bored, other than blog and annoy Richie with blog posts and the word blog.
Richie Rump: Yes, I don’t get bored around here. I don’t know what you’re talking about.
Erik Darling: No, I get bored.
Richie Rump: Yes, I know you get bored, but I don’t get bored. There’s so much stuff to do. I’m like why don’t you just come and help me Erik, and he’s like no, another blog post…
Erik Darling: Well, my problem is I only know Java, so I can’t help you.
Richie Rump: You can’t help anybody. You only know Java.
Erik Darling: I have my doctor Java app and I can write little Java things in there and that’s it. End of it.
Richie Rump: I spent the last 25 years of my career avoiding Java, so there you go.
Erik Darling: It’s not that bad, and it’s gotten better. I mean, 25 years ago, Java today is better than it was 25 years ago. Don’t be silly, Richie.
Richie Rump: Nope.
Erik Darling: Everything gets better – if SQL Server can get better over 25 years…
Richie Rump: Eat your Java beans please.
Erik Darling: I don’t know what that means. It’s probably funny to someone.
If a SQL Server gets more queries, does it get slower?
Erik Darling: Alright, let’s see here. “A user recently complained that some of their queries are taking longer than normal and timing out, looking at our monitoring tool, it looks like there’s more concurrent query activity but not much more can be done if there’s more traffic, correct?” Oh boy. Sounds like you have a…
Richie Rump: Is that correct?
Erik Darling: I mean, on the face of things, no, because you can always tune things more. There’s always a query or an index to be tuned. If it’s a third party app then no, probably not. The only thing you can do is throw some hardware at it, right? More RAM, more CPU, there’s something in there that can be done. I mean, you say not much, I assume that like you know, you’ve kind of thrown your hands up. If it’s a third party app, there’s hardware, if it’s your own internal app then you know, there’s query and index tuning and more hardware, so I wouldn’t say not much. I’d say there’s not much you can do today, but there is stuff that you can start testing and doing over time.
Richie Rump: I would say most things that I have run across with queries, the problem is with the query themselves. So there’s something in the structure of the query that can be adjusted a little bit, whether maybe it’s temp tables or CTEs or there’s something there, maybe a join or two that can be tweaked so it can make the query run faster, and it probably was done by a developer who just doesn’t have much experience with SQL and they just kind of went to joint city and just joined everything together.
Erik Darling: It’s true. That’s a big one, making sure that all your predicates can be searched on. You know, there’s a lot of stuff that I would keep looking at for that, but you know, that’s more than I could fit into a webcast I think.
Should I use columnstore indexes if I have constant inserts?
Erik Darling: Let’s see here. “I have a database, we’ll call it ProductionDB, it has small constant inserts but it’s a pseudo-reporting database. Can I expect, assuming I implement them smartly, large gains in performance by implementing column store indexes?” Jeez, if you’re inserting – I mean, are they a lot of single row inserts or are they lots of large inserts that you can tinker with? Because there is like a weird row number that you have to hit for inserts for column store to take them and make the delta stores and all that other compression thing, fancy language work correctly. If there’s lots of small inserts I would be hesitant to implement column store on it because as far as I know, column store still doesn’t handle writes terribly well, but I haven’t messed with it all that much.
Richie Rump: Sounds like a perfect candidate to test it in a…
Erik Darling: It does sound pretty good. I mean you know, from what I have messed with, column store is pretty rocking technology. The compression is awesome, make your data teeny tiny, lots of aggregation queries are pretty sweet, and you know, if your queries are fairly selective you can get lots of good row and segment elimination – segment elimination I believe it’s called, fancy word for it. So there is lots of great stuff about column store. Whether it’s going to work in this specific use case with the inserts I am not sure.
Richie Rump: Try it, let us know. We would love to find out.
How much memory is SQL Server using?
Erik Darling: Let’s see here. John – someone named John, John Smith, get in the squad car, ask another question. “I’ve assigned 28GB of memory to a SQL server. Task manager of course said it’s using 28GB. Task manager is not reliable for that, how do I tell how SQL server is using those 28GB? Like x percent are going to this thing and y are going to that thing.” There are all sorts of cool system DMVs where you can see what’s in memory. I think there’s like sys.dm buffer memory descriptors or something like that, you can figure out what tables and what indexes are in memory, you can query the plank cache and figure out how many GBs or MBs of plant you have in memory. So there is all sorts of interesting stuff you can look at. I would just check out the DMVs for that. I would just read up on the memory DMVs and figure out which ones are in there. There might be a query somewhere on a blog, I just don’t know – Brent would know if that was on there offhand now. Not that good.
How do you debug parameter sniffing issues?
Erik Darling: In general, how do you start debugging parameter sniffing issues? Well, the first thing I do is I make a temporary stored procedure, I don’t test with local variables and then I figure out what runtime parameters are and what compile parameters are and then I figure out what a plan would look like with each of them. If that’s the same then I start looking for outliers, so I might go into my data and I might start looking for certain values that have a big skew to them. By that I mean if there’s one type of value that has millions and millions or just way more rows than other values in general then I start comparing the plan that that gets by itself – the plans that smaller values get by itself. The classic example is if you have a sales report between Rhode Island and Texas, or if you have the sales reports or procedure you call Rhode Island, it’s going to get one plan, if you call Texas, it’s going to get a different plan. Or, it should get a different plan. It may need a different plan because Rhode Island’s plan would not be sufficient for it. If you call Texas first, you’ll get the big Texas plan and then Rhode Island will reuse that and that might be good or bad, because you might be spending up a lot of additional resources to help out Rhode Island. So there are a lot of ways to go with that, but in general I start looking at common variables to get passed in and if you know, isolated, they get different plans and what those plans are – it could be an indexing thing where like, you know, SQL is too like a weird key lookup plan for a smaller variable because that makes sense, and then when a large variable comes in and starts doing billions and billions of key lookups and that doesn’t make sense for the larger variable, so that could be an index thing where if you can tinker with an index, you’ll take away the key lookup plan as a choice for the large and small variables.
Richie Rump: It sounds like you need to do a little bit of data profiling.
Erik Darling: Little bit.
Richie Rump: When you’re trying to find these parameter sniffing, so if you’re unfamiliar with data profiling, there’s some pretty decent tools out there for it. I wrote a small script that I haven’t touched in a couple years that kind of gets you 75% of the way there, 70% there, called SP Data Profile, but you want to take a look at your data, take a look at your skew and then do all the stuff that Erik just said.
Erik Darling: No, no one’s paying attention there anyway. I feel bad – our transcriptionist is going to fall asleep. Late-stage Groucho Marx is falling asleep.
What causes a query plan to change?
Erik Darling: Nathan asks, “What causes a query plan to change? If all of the variables going in are the same, just the values are different, could auto stats trigger new plans that seem to not be as good?” Well that sounds a lot like parameter sniffing too – could be. So if you’re talking about a stored procedure that takes the same three variables and you pass in the same three variables but then execution plans are different. Well yes, it sounds like auto stats could have updated and invalidated an execution plan and then a weird set of variables may have gotten passed in and you may have gotten one plan and that plan doesn’t make sense for anything else. That could be one thing. If that plan uses temp tables, there’s a recompile threshold with temp tables and store procedures. Paul White wrote the banging-est blog post about it like a decade ago – it’s called ‘Temp Tables and Stored Procedures” if you want to check that out. Just stored procedure uses temp tables, after a temp table changes enough times that something recompiles and you get a different plan and statistics have updated so all sorts of stuff happens, so there’s a lot to look at with that.
Richie Rump: What about an index rebuild, would that trigger it as well?
Erik Darling: Yes, because an index rebuild does update statistics. A lot of other stuff – if there are a lot of table definition changes, you know, stuff like that. If the plan gets thrown out of cache for some reason, like memory pressure, next time you go and run the stored procedure it will have to come up with a new plan and that new plan may have been based on some weird set of variables. I would look into – I guess you can’t really look into that; either it happened or it didn’t.
What’s the best way to find unused indexes?
Erik Darling: Richard asks “what’s the best way to identify unused indexes? What length of time should they be unused before you consider them for delete?” Well, Richard, you know where you are, right? You know that you’re on the Brent Ozar Unlimited Office Hours and that we publish a set of scripts called the First Responder Kit. One of those scripts is called sp_BlitzIndex, and sp_BlitzIndex tells you when you have unused indexes, and that’s a really great place to start. For free.
Richie Rump: For free, and to be honest, those are the scripts that I used first, before I came here.
Erik Darling: I was like dismally familiar with them. Let’s see here, “what length of time should they be unused before considering them for delete?” Well, that’s why you don’t delete them, you just disable them. So then if it comes up that you needed one of those indexes, it’s a lot easier to go back into the GUI and right click and hit rebuild than it is to figure out which index definition you’ve lost entirely. What length of time kind of depends on your application, you know, you could have that index that gets used once a month for an end of month processing report or quarterly for some vice-president report that you might absolutely need but might look unused the rest of the time. So when I start looking at that stuff I like to run BlitzIndex weekly and then look at things and what I like to look for, over longer periods of time than a week, is writes versus reads. So for unused indexes you’re always going to see – things that get flagged as unused indexes in sp_BlitzIndex are always zero reads, but I like to look for ones that also have high writes, because if they have zero reads and zero writes I just don’t care, they can sit there and take up metadata space and not do anything, it doesn’t really bother me much, especially because I don’t rebuild indexes so it doesn’t matter, but if they have really high writes and no reads then I start getting concerned because that’s slowing me down. That’s hurting I/O, that’s hurting modifications, inserts, updates and deletes but it’s not really helping any re-query. So more than any length of time, I like to look at how many writes have accumulated on our index; that’s a good way to go.
What names should Erik and Richie use next week?
Richie Rump: So here’s another question, “next week Erik should be named Sloth and Richie should be named Chunk, or the other way, whatever you guys want, fight over it…” I call Data, but thank you, that’s a good suggestion. Sloth over here is probably…
Erik Darling: I want to take Mouth.
Richie Rump: Mouth…
Erik Darling: He was my favorite.
Richie Rump: That actually is pretty accurate.
Erik Darling: That scene where he’s talking to the maid in Spanish about where the drugs are. That still makes me laugh.
When will I see a GroupBy.org presentation from you?
Richie Rump: Here’s another question, “when will I see a GroupBy.org presentation from one of you fine gentlemen?”
Erik Darling: Well not the next one around because those presentations already got picked. I was sort of intentionally not throwing my hat in the ring for the first couple because I really wanted other people, I didn’t want it to seem like the Brent Ozar Unlimited show for GroupBy, but Brent is actually actively encouraging us to submit for future sessions, so probably the next one.
Erik Darling: Yes, tell me more about sorting arrays efficiently.
Richie Rump: Who does that anymore man?
Erik Darling: I don’t know…
Richie Rump: Let the CS101 guys figure it out.
Erik Darling: I’m basically a CS101 guy, that’s my gig.
Should I shrink my database if…
Erik Darling: Let’s see here, “is there a reason to shrink if you have a non-normal operation that made it grow in the first place?” Well, I’m going to – yes, go ahead and shrink it down. Just truncate the empty space. I assume you’re talking about your data file or your log file, I assume you’re not talking about anything personal. So what I would do is just get rid of the empty space, there’s a truncate only thing you can do where you just get rid of the empty space; that’s usually the safest thing to do. If you – see whenever people talk about the bad stuff that shrinking a file causes, it’s like when you shrink a file like way beyond just what’s empty, when you start mashing stuff things get really bad. Lately when I’ve tested it and I’ve looked at things and I’ve just been getting – like when I do demo stuff and I blow up stuff by doing things then I have all this stuff in my overflow data file that just isn’t used by anything, I’ll truncate the empty space in 1GB or 5GB increments, and then like if I check the next fragmentation after that, like oh, this would make a good blog post, but I haven’t seen anything from fragmentation like that in a while so…
What headsets and microphones do you guys use?
Richie Rump: Meanwhile I’m in derailing mode, so Erik, which make/model headset and microphone are you using? Pros and cons…
Richie rump: Focusrite Scarlett 2i2 – I think that’s pretty standard what we’ve got here.
Erik Darling: It has the one jack, not the two jacks.
Richie Rump: Oh, so you’ve got the Focusrite Scarlett Solo?
Erik Darling: Yes, well it has the big jack and then the little microphone jack and the knob and I can hear myself talk too much, and it has a guitar thing on it, so I guess if I wanted to play the guitar I could do that too.
Richie rump: Yes, that’s the Solo. The audio interface that I’m using is the Focusrite Scarlett 2i2, has two audio inputs; never use them, only need one because there’s only one of me, and I’m also using the Audio Technica ATH-50X headphones with Memory Foam inserts as well as a special skin, you can see it right there, because I’m vain like that, and I’m also using an Audio Technica 2005 microphone which is a USB and XLR. One of the kind of flexibilities of USB or XLR, I think I’ve kind of outgrown that and I only need an XLR because I’m using the Scarlett so much, but if I’m ever on the road and I need a USB, it’s kind of there. So that’s kind of my gear and how we’ve got the setup.
Erik Darling: And that was my chance to tune out. Thanks, Richie.
Richie Rump: Everybody except that one guy is like “what?”
Erik Darling: Good time, good time.
What’s your favorite learning resource for internals?
Erik Darling: Alright, Germal asks a fun learning question. I like this one because I don’t have to get too specific and it’s okay if I NyQuil this one a little bit, “what’s your favorite resource to learn about query processing internals and what do you recommend for getting better at troubleshooting query performance issues?” Well for me, Paul White’s blog posts – when you start reading them they will be a million years beyond your reach, but as you mature they will start to make more sense. Ditto Adam Machanic’s blog posts, they both write some rocket science-y stuff. As far as other stuff goes, the SQL server internals books by Kalen Delaney at all are good. there are optimizer chapters written by Connor Cunningham who used to own the query optimizer at Microsoft and the 2012 and 2008 are two versions of that, pretty much the same so I would just stick to the 2012 one; those are my favorites, those are my crazy people.
Richie Rump: I think I started my journey just kind of paying attention to what people were writing over Twitter and posting on Twitter. A lot of, oddly enough, Brent Ozar stuff – I Don’t know why you’d listen to that guy… Then I started attending a lot of SQL Saturdays and started presenting to a bunch of SQL Saturdays about project management. That’s kind of how I got started, I was talking about project management at SQL Saturdays so I could go in and I could meet the speakers the night before and then meet and hear all the database stuff kind of afterwards, and Summit is a great place I think to learn a whole bunch of different stuff and meet a bunch of different people. So that’s kind of how I came into the whole database stuff, and now I don’t even touch it at all. Thanks, guys, back to my developer roots…
Erik Darling: The gateway to query processing knowledge really is learning about execution plans. So the more you can learn about execution plans the more you can pry into…
Richie Rump: Oh and Grant has a good book on that too, I believe.
Erik Darling: He does.
Richie Rump: Yes I’ve read that one, that was a good one.
Erik Darling: Yes, Grant has a good book on – Grant Fritchey has a book on execution plans, and a long time ago a guy named Fabiano Amorim published a series for Simple Talk where he talked through common execution plan operators, and I used to love reading that over and over again.
Richie Rump: I remember that, that was pretty good.
Erik Darling: yes, it was like 14 or 15 different things about all different query plan operators and what really impressed me so much about those was that this guy could write queries that would produce certain operators, and that’s when it started to make sense to me, like oh no, there’s a logic to this. Like when I do certain things, certain operators are going to pop up and this is why. That was one of the first like light bulb SQL moments I ever had, reading through this like “you can make it do that, that means it does that reliably based on things!” And that was lovely for me.
Richie Rump: So my first ah-ha in SQL server was I was at SQL Saturday Orlando and I sat through an Adam Machanic hour-long session and I didn’t get lost. Right, so this was like a 400 level session, and I’m sitting there and I’m hanging in there and hanging in there, I’m like I got it, I got it, and then at the end it was like “I understood that!” Light bulbs started going off, fireworks behind my head, I’m like wow…
Erik Darling: Do you remember which talk it was?
Richie rump: No, I have no idea. I’m sure I could Google back and figure it out.
Erik Darling: One of my favorite talks ever by him was – it’s on Youtube, it was a TechEd conference talk and it’s called like the Top Five execution Plan Problems That You Can Fix or something like that, and it was like watching – like I’m with you on that…
Richie Rump: Was that the crazy ones where he goes, you know, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10?
Erik Darling: Yes…
Richie Rump: It may have been the same one.
Erik Darling: Yes, and then like all of a sudden it’s parallel and there’s a distinct and the averages and you’re like “I get it, that made sense to me, I can go home and reproduce that!” But yes, Adam Machanic and Paul White are two of the main guys.
What are you learning next?
Richie Rump: I know we’re a little bit over time but let’s get one more, “what are you each learning now for your next blog posts?”
Erik Darling: Learning? Oh boy – I’ve been writing and rewriting a lot of training material for some upcoming videos that are going to be part of our cool subscription re-launch. So I’ve been going over training materials on sargability, dynamic SQL, filtered indexes, computed columns, common table expressions, tons of stuff like that. I don’t know how much of it is going to make it to blog posts because I do still have to record it all, but we’ll see what happens.
Erik Darling: I’m not actually bored, sometimes it’s just weird to be able to choose what you want to do sometimes. Like you have a mission, you have a dedicated purpose…
Richie Rump: I’m on a mission. This is a mission, not a smalltime thing…
Erik Darling: You know, like when I have free time, in other words when I’m not on client time and I get to pick what to do, I’m like First responder Kit, blog, new training materials. There’s like all these directions that I get pulled in.
Richie Rump: Someday I’m going to write a video training and it’s just going to be how to live with Erik, and it’s going to be like an hour or two hours of just that.
Erik Darling: I can’t wait for my wife to watch that.
Richie Rump: I’m mainly doing it for your kid, do…
Erik Darling: Oh she doesn’t know, she barely knows I’m here.
Richie Rump: She will, she will figure out that you’re in a closet, okay.
Erik Darling: She will. Well, it’s not that closet-y, this is like a separate zip code in New York.
Richie rump: It’s closet-y.
Erik Darling: You’re closet-y. We should go – goodbye, thank you for showing up. Wish me well because I’m sick, goodbye.
Wanna join us at the next Office Hours to ask a question? Register here.