[Video] Office Hours 2016/11/09 (With Transcriptions)

This week, Richie and Erik discuss log backups, migrating from SQL 2008 R2 to 2014, shrinking databases, tuning long-running queries, execution plans, sequences, encryption, and much more!

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2016-11-09

 

Erik Darling: Let’s start with some questions. We have one from fellow human asking if there is a way to add an additional node used for read-only access to a SQL Server 2008 cluster. No, sir. No how. No way. Unless you log ship to another node—well, that wouldn’t really be a cluster node, that would be a separate server. You could log ship, or use mirroring and take snapshots of the mirror. That is really the only way to do that in 2008 R2. You would need availability groups or one of those things, replication, or something else, but it wouldn’t really make sense to replicate to another node in the cluster. So, no, short answer. Long answer: you would need a separate server with some different technology in the middle.
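If you do go the mirroring route, reading from the mirror means creating a database snapshot on it (database snapshots are Enterprise-only). A minimal sketch, with placeholder names and file path:

CREATE DATABASE YourDb_Snapshot
ON ( NAME = YourDb_Data,  -- logical data file name of the mirrored database
     FILENAME = 'D:\Snapshots\YourDb_Snapshot.ss' )
AS SNAPSHOT OF YourDb;

Readers connect to YourDb_Snapshot instead of the mirror, and to refresh the data you drop the snapshot and create a new one.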

 

Erik Darling: Fellow human asks, “sp_MSadd_repl_commands is blocking heavily in the distributor.” Cool, good to know. I’ll alert the media. Tara is not on yet; I think her answer might be to either make sure you have Read Committed Snapshot Isolation turned on, and if you already have that, then you might want to just kill and restart it. I do not know much more about replication than that.

 

Erik Darling: Let’s move on and ask a question that I have a good answer to. There is one about backups. “If I take a log backup while a full backup is running and another log backup after the full backup finishes, how would I restore the database with all log backup files?” Geez. That is such a simple thing to test, right? Take a full backup, take a couple log backups. So the answer is log backups taken during full backups don’t truncate the log file. They have to wait for the full backup to finish before that happens. So you would just need the full backup and the log backup file taken immediately after the full backup. The one taken during the full backup doesn’t really have much in it.
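So the restore sequence is just those two files. A minimal sketch, with placeholder file names:

RESTORE DATABASE YourDb
    FROM DISK = N'D:\Backups\YourDb_Full.bak'
    WITH NORECOVERY;  -- stay in restoring state so log backups can be applied

RESTORE LOG YourDb
    FROM DISK = N'D:\Backups\YourDb_Log_AfterFull.trn'
    WITH RECOVERY;    -- bring the database online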

 

Erik Darling: Another question, “I used CrystalDiskMark to determine the speed of the disks between servers, including one we just purchased.” They asked the SAN tech why the speed from their drive was “six times faster than the recently purchased drive.” Oh, boy. I don’t know. You are in a particularly weird situation. I don’t know if CrystalDiskMark is giving you false data. You could try a utility called Diskspd and see if you get any different results from that. I don’t know what kind of SAN you’re using or where they got that information from, so it would be hard for me to comment on it.

 

Erik Darling: I have another question here from Tammy that I don’t have an answer to. “I’m looking for a way to differentiate SQL users without logins versus orphaned SQL users.” Don’t know. I’ve never had to do that. Couldn’t tell you.

Richie Rump: [Inaudible] using T-SQL, are those users that are in the database but not attached to the server?

Erik Darling: Something like that, yeah.

Richie Rump: I used to have a script way back when, in a galaxy far, far away that kind of identified those and deleted them.

Erik Darling: Oh.

Richie Rump: I don’t think I’ve got them. Because that typically happens when you do a backup and restore from one database to another.

Erik Darling: Right. You know what, that jogged something in my mind… Is that what it is? Oh yeah. So there’s a Microsoft stored procedure that can, maybe the code in there can lead you in the right direction. It’s called sp_help_revlogin. I’ll stick the link to it in chat for everyone if they want it.

Richie Rump: If he could figure out how to paste things from the…

Erik Darling: Control V is hard. Yeah, that’s the best advice I have for you. Check out the code in there, it might get you somewhere on the way.
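For what it’s worth, on SQL Server 2012 and up the two kinds of users can be told apart in sys.database_principals: users created WITHOUT LOGIN show an authentication type of NONE, while orphaned users look normal but have no matching server-level SID. A rough sketch:

SELECT dp.name, dp.type_desc, dp.authentication_type_desc
FROM sys.database_principals AS dp
LEFT JOIN sys.server_principals AS sp
    ON dp.sid = sp.sid
WHERE sp.sid IS NULL
    AND dp.type IN ('S', 'U')                    -- SQL and Windows users
    AND dp.authentication_type_desc <> 'NONE';   -- NONE = created WITHOUT LOGIN

Drop the last filter and the query shows both kinds, so you can compare them side by side.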

 

Erik Darling: “In SQL Server 2014 the DAC port seems to be completely random to me. How can I fix the port to a specific one, especially with multiple instances on a machine?” I don’t believe you can. I don’t believe that is a configurable option. I think the remote DAC port just shows up; my question would be, is the SQL Server port also always the same? Richie, anything good? See any good questions in there that you want to throw out?

Richie Rump: I think they’re all great questions.

Erik Darling: No you don’t. You’re lying.

Richie Rump: Our viewers do not ask poor questions. They’re all wonderful, just like they are. Unlike yourself, Erik.

Erik Darling: Unlike me, unlike me.
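Back on the DAC question: while the port may not be configurable, a named instance does write the port it picked to the error log at startup, so you can at least find out what it is. Something like:

EXEC sys.xp_readerrorlog 0, 1, N'Dedicated admin connection';

Look for the “Dedicated admin connection support was established for listening locally on port …” message.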

 

Erik Darling: We’re just going to keep going on down the line then. “I have a few SQL Agent jobs that are disabled but ran as scheduled anyway. msdb.dbo.sysjobs shows these jobs as enabled = 0. Do the schedules on these jobs need to be disabled as well?” If they’re maintenance plans, I believe so. Also, sometimes you can chain jobs, and some SQL Agent jobs can call other SQL Agent jobs. So I would be on the lookout for that as well.

Richie Rump: Chain-smoking jobs? Is that what you’re…?

Erik Darling: Yes, chain smoking, but we don’t know what. Could be anything. Hopefully legal.
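On the jobs question itself, a quick way to spot jobs that are disabled but still have enabled schedules attached, as a sketch against the standard msdb tables:

SELECT j.name AS job_name,
       s.name AS schedule_name,
       j.enabled AS job_enabled,
       s.enabled AS schedule_enabled
FROM msdb.dbo.sysjobs AS j
JOIN msdb.dbo.sysjobschedules AS js ON j.job_id = js.job_id
JOIN msdb.dbo.sysschedules AS s ON js.schedule_id = s.schedule_id
WHERE j.enabled = 0   -- the job is disabled...
  AND s.enabled = 1;  -- ...but its schedule isn't

Job history will also tell you whether a given run came from a schedule or was kicked off by a user.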

 

Erik Darling: Fellow human asks, “When migrating from SQL Server 2008 R2 to 2014, should trace flag 4199 be enabled globally or not?” Well, that trace flag contains a lot of optimizer fixes; whether you’re running into issues where those fixes apply is heavily workload dependent. There’s not really a good yes or no answer for that off the bat. The best thing I can tell you to do is set up a 2014 test server and try your queries with the trace flag on and off. Microsoft is changing a bit how that trace flag works, where it’s just going to bucket optimizer fixes up to current on different versions. So 4199 actually contains different sets of fixes between versions. It’s just a bucketizer trace flag now for optimizer fixes or changes; it’s not the steady set of fixes and changes that it was prior to 2014. The other thing that you’re going to have to take a good, careful look at is whether the new cardinality estimator is helping or hurting your queries. So make sure that you test both things in conjunction. Make sure you’re not getting something weird because of the new cardinality estimator. Make sure you’re not getting something weird because of 4199. You have your work cut out for you. That’s a tall order of testing depending on how much code your application generates.
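If you’d rather not flip the flag server-wide while testing, both the hotfixes and the legacy cardinality estimator can be toggled per query. A sketch using a hypothetical dbo.Orders table (QUERYTRACEON requires sysadmin, so it’s more of a testing tool than a production fix):

SELECT o.OrderID, o.OrderDate
FROM dbo.Orders AS o
WHERE o.CustomerID = 42
OPTION (QUERYTRACEON 4199);   -- optimizer hotfixes on, for this query only

SELECT o.OrderID, o.OrderDate
FROM dbo.Orders AS o
WHERE o.CustomerID = 42
OPTION (QUERYTRACEON 9481);   -- 9481 forces the legacy (pre-2014) cardinality estimator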

Richie Rump: I haven’t messed with the new cardinality estimator because I’ve been stuck in dev-ville. Have you seen anything in the wild where the new cardinality estimator would take something, a good plan, and then turn into utter garbage?

Erik Darling: I haven’t seen that, just because we’ve had precious few people a) using 2014, b) using it in the higher compatibility mode, and c) with anything to compare it to in a prior mode. The few times that I’ve seen 2014 acting like 2014, it’s just been sort of trying to assess how that’s working and not really being able to compare it to anything else that happened before for them. There’s stuff that I really like about it. I think it’s kind of a good idea that it assumes some correlation in your where clauses. If you’re comparing columns in the same table, it assumes there is at least some overlap between values in one and values in the other, rather than treating them as two independent cardinality estimates. So I think that’s cool, but that may not work for everything. Customer ID 1 may have never purchased anything on a certain date and Customer ID 2 may have made all their purchases on a certain date. So it’s good to assume that there’s some correlation, but maybe not as much as they do.

Richie Rump: Right, so kind of our recommendation is try it but test it.

Erik Darling: Yeah.

Richie Rump: Along with everything else.

Erik Darling: What’s the difference between trying it and testing it?

Richie Rump: You could just try it and throw it in production. What could possibly go wrong? Testing it is like, “Oh well, we’re not going to throw it in production. We’re going to do it in a staging environment. We’re going to run some stuff up against it.”

Erik Darling:  Like jumping out of planes.

Richie Rump: Without a parachute. What are you? Captain America?

 

Erik Darling: There’s a question here from someone who has auto shrink set to on. They turned it off and their database grew. Stop shrinking your database. That’s it. That’s the only answer for you. Stop shrinking your database. If it’s a one-time thing, it’s a one-time thing, but you’re doing it repetitively. You want to keep shrinking it. You want to keep beating that database up. You want to keep beating your whole server up by shrinking your database. That’s not good.

Richie Rump: Because every time you shrink a database, Paul Randal kills a kitten.

Erik Darling: He does. Not quickly either. It’s like a slow, like days long thing.

Richie Rump: I actually did not know that. I thought it was more like a [makes sound] but no, he’s… oh, that’s terrible. Yeah, don’t shrink.

Erik Darling: Yeah, he starts with the tail.

Richie Rump: No, no, poor guys.
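For the record, turning the setting off is a one-liner; the database name here is a placeholder:

ALTER DATABASE [YourDb] SET AUTO_SHRINK OFF;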

 

Erik Darling: Here we go. Here’s one that I like. “When tuning a long running query, what are some of the low-hanging fruit to watch out for in the execution plan?” Richie, do you have any favorites?

Richie Rump: A long running query? I like to see those scans. I like me some scans. I like to take a look and see if maybe I’ve got some nested loops with some very large numbers going into them. Maybe I’ve got the wrong join type there. Maybe it should have been some sort of merge or hash join as opposed to that nested loopy thingy.

Erik Darling: Yeah, buddy.

Richie Rump: I’m going to take a look at maybe some functions, some aggregates there. Maybe there’s something going down there. Those are some fun things that I don’t get to do anymore.

Erik Darling: Yep, ain’t no scans in the cloud.

Richie Rump: But it’s better because we don’t have any of that stuff in the cloud.

Erik Darling: That’s true. I agree. You don’t have to deal with it. Everything just works fast. It’s so nice. Yeah, we have sort of an index on three things, mostly there.

Richie Rump: We’re not joining tables together. We are absolutely not doing that.

Erik Darling: No, and we have mostly the same data in all three places. Give or take.

Richie Rump: Kind of, eventually.

Erik Darling: Someday we will have it there. When I’m tuning long-running queries, the stuff that I like to look out for is spills to disk. That’s a good one because when you spill to disk you slow yourself way down, so look for hashes or sorts that spill. My big pet peeve is key lookups, because key lookups imply that you are doing something horrible many, many times. Sometimes they’re fixable if you’re not getting too many columns back from the clustered index; other times you have to balance how important the query is against whether you want a super wide index to cover everything so you’re not doing a key lookup.

Richie Rump: What if I don’t have a clustered index? What’s the lookup there?

Erik Darling: It goes to a heap and then you’re in double trouble because if the heap has forwarded records in it, you can end up doing a key lookup plus a forwarded records search.

Richie Rump: But wouldn’t it be a RID lookup at that point?

Erik Darling: Yes.

Richie Rump: I remember something from a Brent Ozar class once.

Erik Darling: Yes, good for you. From a pre-con long, long ago.

Richie Rump: A long time ago, somewhere in Atlanta, back in 2012.

Erik Darling: Let’s see, other stuff that I like to look for: lazy spools are probably a good one. Lazy spools are an interesting thing where SQL attempts to cache a query result in tempdb and maybe reuse it, maybe not. So if you see a table spool and it’s rebinding a lot, that means SQL is re-caching data in tempdb. Stuff like that is a pretty good start, you know. The old missing index request is a good one. Let’s see, off the top of my head: implicit conversions, maybe, and scalar functions.

Richie Rump: Implicit conversion, which is really hard to find.

Erik Darling: Yeah, it’s really tough. There are no warnings about that one anywhere. One thing you might want to try is giving things a look through with sp_BlitzCache and seeing if it warns you about anything, because we do warn about a lot of this stuff in sp_BlitzCache. So if you want to grab the plan handle or the SQL handle or something, run it through sp_BlitzCache and see what you come up with.
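That looks something like this, assuming sp_BlitzCache is installed (the handle value here is just a placeholder):

EXEC dbo.sp_BlitzCache @OnlySqlHandles = '0x0200000012345678';  -- check one specific query

EXEC dbo.sp_BlitzCache @SortOrder = 'cpu';  -- or review the top offenders and their warnings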

 

Erik Darling: Wow, okay, here’s a weird one. “Do you have any policy templates for things like, ‘No, you can’t have SA for your application?’” Okay, so I’m not sure if you’re asking about policy-based management or just for a polite response to why your application can’t use SA. If you follow up on that, I can maybe tell you that a) I’ve never used policy-based management, and b) if an application is asking for SA, then put it in writing that this application can make all sorts of changes to the server and do anything it wants, and that you cannot be held entirely responsible for anything the application does. Because if someone drops a database from the application, or there is some sort of SQL injection-y thing going on where someone does something horrible, it can’t all be on your shoulders: that was what the application was granted, and you fought against it, and someone said yes anyway. So heads up there. Oh, that’s a long one.

Richie Rump: And no, you shouldn’t have SA.

Erik Darling: No, you shouldn’t have SA.

Richie Rump: You shouldn’t have SA.

 

Erik Darling: There is a very long question in here about, “What factors can change the execution plan, especially for a compiled and prepared parameterized query?” That is a long question. Head on over to dba.stackexchange.com and post it there. Post the execution plan on our site: if you go to brentozar.com, under Free Stuff—I just want to make sure I’m saying the right thing—under Free Stuff there will be a link to Paste The Plan. You can paste the good and the bad execution plans and post those on dba.stackexchange.com. You’ll get a much better answer than me trying to decipher this Bible paragraph of text.

Richie Rump: Thou shalt give bad execution plan.

Erik Darling: Yes, and then a nested loop join. We’ve got a hash match.

 

Erik Darling: All right. What? “What’s the best way to downgrade a 2014 database for restoring to a 2008 R2 server without building out a 2014 server?” You don’t.

Richie Rump: Yeah.

Erik Darling: Yeah. The only thing you can really do there is restore the database on a 2014 box; it’s not going to be restorable to anything lower than 2014. Then you’re going to have to script the data out or use some other functionality to move the data over to a lower version. You can’t directly restore or attach higher version databases to a lower version. There’s a whole bunch of upgrades and changes that go in when a database is moved to or created on a newer version, and it expects to be on that version or higher. So, no such thing.

 

Erik Darling: Here’s one, “Is shrinking a data file resource-expensive and/or does it affect other dbs in a negative way?” Yes. It takes a lot of CPU and a lot of disk IO to shrink a database, especially depending on how big it is and how big the shrink is. It takes up a lot of resources to do that. So yeah, I wouldn’t haul off and just keep shrinking the database. If it’s low use and going to be archived and decommissioned anyway, then just stop messing with it. Just leave it alone.

Richie Rump: Leave it alone, man.

 

Erik Darling: There’s a question here about sequences over identity columns. Have you used sequences much?

Richie Rump: Yes, I’ve messed with them, sure. Yepper.

Erik Darling: What are your feelings on them?

Richie Rump: If you’re using a sequence, it’s mainly because you need some sort of controllable numbering. I wouldn’t go to a sequence if I need an auto-numbering type system, like most databases have these days; an identity column is exactly what you need there. Sequences are for when, say, you want to skip numbers or something like that, or if you need to loop back around; a sequence does a looping thing too, and it does it pretty well. So there are a lot of cool things that you can do with it, but it’s not for a unique ID of a row or something like that. If you want to do something that’s kind of numeric but you need to do something funky with it, a sequence is probably your way to go as opposed to an identity. But if I want to uniquely identify a row, identity is what you need to go with.

Erik Darling: So for sequences, I think they’re pretty cool, especially if you need to hand out unique numbers to a few tables. So like multiple tables can reference a sequence and kind of get values from it. So if you need to do that, that’s kind of nice. Although the circumstances where you’d need to do that probably are kind of weird.

Richie Rump: Yeah, I would question that architecture at that point and really ask, “Well, why do we really need to do this and why is the structure in this manner?”

Erik Darling: The other thing to be careful with on sequences: it’s sort of like when you bulk insert a bunch of data into a table with an identity column and it fails, you lose all those identity values. SQL is like, “I was going to give those out… but then no, I’m not going to give those out…” So with sequences, depending on how many values you cache, if you reboot or if you lose a transaction, you could also lose a whole bunch of sequence values. So if you’re depending on ordered numbers (ding, ding, ding, one through whatever, everything unique and in order), then neither one is really perfect for that.

Richie Rump: Nah. You shouldn’t be having your clustered index on those anyway, if that’s the case. If that is indeed the case, then you should probably have that as some sort of alternate key or something.
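For reference, a minimal sequence sketch, with hypothetical names:

CREATE SEQUENCE dbo.InvoiceNumbers
    AS int
    START WITH 1
    INCREMENT BY 1
    CACHE 50;  -- cached values can be lost on restart, as discussed above

DECLARE @n int = NEXT VALUE FOR dbo.InvoiceNumbers;  -- any table or caller can draw from it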

 

Erik Darling: Fellow human says that they had an outage last night and they were seeing lots of the following error, “Time out occurred while waiting for buffer latch.” There’s some page information and there’s some seconds and it says, “Continuing to wait.” I’ve never seen that error either. I would definitely post that on dba.stackexchange.com. I’ve never seen that one. I wouldn’t have a good answer for you.

Richie Rump: Hmm, was there a site that went down last night that got a lot of social media play?

Erik Darling: I don’t know. I don’t pay attention to the internet at night.

Richie Rump: Maybe.

 

Erik Darling: Let’s see here. “Is there a tool to check for compatibility issues before changing the compatibility level for a database?” Hmm. You can check the Microsoft data migration tool… yes, the Upgrade Advisor. Yes, there we go. Microsoft provides a tool called the Upgrade Advisor. There are also a couple of lists of breaking changes between versions on Microsoft’s site that might be helpful to you.

Richie Rump: Yes, identity also loses, sometimes loses value, yes.

Erik Darling: Yeah.

Richie Rump: Correct.

Erik Darling: I said that. I said if you try to do an insert and it errors out, you can lose values.

Richie Rump: Yeah, you can lose it, but then again, it’s an identity column and why should we really care?

Erik Darling: No, it is a surrogate key. If you want your data key to be an identity column, you have issues. Issues beyond compare.

 

Erik Darling: Going on down the line, “Is there a guide/list of basic errors which should be set up as alerts on a common server?” Yes, if you go to brentozar.com/go/alerts there is a list of errors that I believe Jeremiah compiled way back when. It covers the high severity errors, which are 16 through 25, which is the stuff that you should probably be paying attention to. Some of the lower numbered ones, like some of the stuff below 19, can get a bit chatty with things like failed logins, so you can make your own choice on whether you want to get all those emails or not. It also covers some pretty important corruption errors, 823, 824, and 825, which cover some of the hard and soft IO issues that you have going on. But you have to make sure that your databases have page verification turned on, I believe, in order for those to work, because they check [inaudible] alerts on the checksums there.
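If you’d rather script a couple of those by hand, the shape looks like this (a sketch; the operator and notification setup is assumed to exist separately):

EXEC msdb.dbo.sp_add_alert
    @name = N'Severity 019 - Fatal Error in Resource',
    @severity = 19,
    @delay_between_responses = 60,
    @include_event_description_in = 1;

EXEC msdb.dbo.sp_add_alert
    @name = N'Error 823 - Hard I/O Error',
    @message_id = 823,
    @severity = 0,  -- must be 0 when @message_id is used
    @delay_between_responses = 60,
    @include_event_description_in = 1;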

 

Erik Darling: All right. We do not have any other questions in the queue.

Richie Rump: I wonder if the folks online can figure out what kind of keyboard you’re using by your keystrokes when you’re googling. That would be interesting.

Erik Darling: No, probably not. It’s a very obscure keyboard. Just kidding. It’s not very obscure at all.

Richie Rump: You’ve got a standard Microsoft keyboard.

Erik Darling: No. I got this Logitech one that doesn’t have key wells. It’s like you can dunk it in water if you wanted to clean it. I like it because the keys are kind of thick and chunky and they have like good noisemaking but they don’t…

Richie Rump: Mechanical. You’ve got to go mechanical, man.

Erik Darling: Yeah, but I have a beard and I eat lunch in here. So it’s not happening. Because when I clean out this keyboard that doesn’t have wells, it’s already not pretty.

Richie Rump: I cleaned out mine this weekend. I pulled all the keys and vacuumed it and did all that stuff because the amount of cat fur that gets in this office because of cats just sleeping on my desk is crazy. And yes, thick and chunky keys. Those are important. It’s almost like dice, you like your thick and chunky dice, thick and chunky keys. They’re important.

 

Erik Darling: “If we suddenly needed to encrypt some proprietary data on our SQL 2008 and 2014 instances, what would be the easiest way to do that?” Well, encryption covers a lot of different stuff. Shane just brought up that you can encrypt backups. If you’re on Enterprise edition you can use TDE to encrypt your data at rest, but that doesn’t help your data in flight. Then, let’s see… no, that’s 2016, so that doesn’t count. Those are basically your options aside from buying a third party tool that encrypts stuff. Richie, how would you encrypt some sudden data?

Richie Rump: Like you said, it depends on the requirements. If it just needs to be at rest and everything else is cool, then typically TDE would be—and I have Enterprise, right—TDE would be the way to go. It’s fairly easy to configure. Don’t lose the certs, right, those type of things. You could also implement encryption on the app side, which I’ve done multiple, multiple times. It’s fairly easy to do, and it keeps the encryption off the server, which means all the CPU cycles happen on an app server. That’s probably beneficial since you’re paying so much money for your SQL Server licenses. But then again, if you ever want to get data in and out, it’s got to go through the app server, so there are definitely pluses and minuses to both.

Erik Darling: Yeah, one of the really tough things about encryption is that, especially prior to 2016, TDE can kind of break a few other things, like you can’t compress your backups anymore. You can still compress columns and data but aside from that you’re a little bit out of luck.
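The TDE setup itself is short. A minimal sketch (Enterprise edition; names and password are placeholders, and back up that certificate, because without it the database is unrecoverable):

USE master;
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'SomeStrongPassword!1';
CREATE CERTIFICATE TDECert WITH SUBJECT = 'TDE certificate';

USE YourDb;
CREATE DATABASE ENCRYPTION KEY
    WITH ALGORITHM = AES_256
    ENCRYPTION BY SERVER CERTIFICATE TDECert;

ALTER DATABASE YourDb SET ENCRYPTION ON;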

 

Erik Darling: “What do you recommend for data masking? Social security numbers, account information?” Probably just hashing the columns; that’s about what I would do. If it’s just a column-level thing, I would either hash it and have a [inaudible] somewhere, or create a computed column that hashes the columns that you’re interested in encrypting, and not let anyone at the raw columns.

Richie Rump: Right. The other way you could do it is physical partitioning, right? Create separate tables for the different levels that you want.

Erik Darling: Or views.

Richie Rump: Or what?

Erik Darling: Views. You can use views so people only access the views and the critical columns aren’t included in the views.

Richie Rump: Yeah, absolutely.
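One hedged sketch of the hash-plus-view approach, with hypothetical names. Note that a bare hash of something low-entropy like an SSN can be brute-forced, so salting it is worth considering:

ALTER TABLE dbo.Customers
    ADD SSNHash AS HASHBYTES('SHA2_256', SSN) PERSISTED;  -- SHA2_256 needs 2012+; use SHA1 on 2008

CREATE VIEW dbo.Customers_Masked
AS
SELECT CustomerID, CustomerName, SSNHash  -- the raw SSN column stays out of the view
FROM dbo.Customers;

Then grant users SELECT on the view and nothing on the base table.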

 

Erik Darling: All right, before we take off, fellow human said, “Did you answer my question about buffer latch problems?” No. I said to post that on dba.stackexchange.com. I have not seen that error personally and I wouldn’t want to speculate. So I would post that over there to get some more eyeballs on it.

Erik Darling: With that, we are at the 45-minute mark. Richie is going to go continue to run around nude on his lawn yelling “Cubs win!” instead of working.

Richie Rump: Yes, with only my W flag because that’s the only thing I’m wearing now for the past seven days.

Erik Darling: Only the W flag and your imagination. All right everyone, see you next week. Thanks for coming.

Richie Rump: See ya.


4 Comments

  • In the podcast you said, “If you want your data key to be an identity column, you have issues.” We need to create unique sales invoice numbers for a website (multiple application servers). Back in SQL 2005 the business started using identity columns to generate the IDs. It made sense because they increased by 1 and were pretty sequential (you lost a few if a transaction rolled back). When we moved to 2012, we had to turn on trace flag 272 so that we don’t jump 1000s of IDs at a time on a restart or failover (finance auditors apparently don’t like huge gaps like that). Is there a better option than using an identity column to generate these sorts of IDs to be unique across multiple web servers? Is that a use for Sequence?

    • Unique across multiple web servers? I’d create a table of unique sequential values and an IsUsed bit column, and dip into that when I need a new one. Make sure I’ve locked the table when accessing it, so that it doesn’t create a race condition.

    • Noah Engelberth
      November 15, 2016 1:51 pm

      One way I’ve seen in some billing apps is to have a “keys” table that gets used to track what the next Purchase Order #, Invoice #, etc., should be. The transactions then manage incrementing that value when a new item is created.

      As you should expect, that’s not really a high performance way to do it, since now all of your creation transactions depend on a row in that table, meaning you can’t really create two invoices (or two purchase orders or whatever) at the same time – the first one in will block the rest until its transaction completes or rolls back. But hey, just send the auditors a bill for all your extra SQL CPU cores, right?
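      For what it’s worth, the atomic version of that pattern usually looks something like this (hypothetical names), so two sessions can’t grab the same number:

      DECLARE @NextInvoice int;

      UPDATE dbo.NumberKeys WITH (UPDLOCK, HOLDLOCK)
      SET @NextInvoice = NextInvoiceNumber = NextInvoiceNumber + 1  -- increment and read in one statement
      WHERE KeyName = 'Invoice';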

  • For the person with the disabled job that ran: if possible, check the job history. If the job was kicked off by a schedule, it’ll say which schedule kicked it off. If it was kicked off manually or programmatically, it’ll say it was kicked off by a user.

