This week, Brent, Angie, and Tara talk through your questions about monitoring tools, transactional replication, configuration management, source control software suggestions and much more!
Office Hours Webcast – 2016-06-15
Why did some of my drives disappear?
Angie Rudduck: Bruno says he has an instance with several DBs and suddenly a couple of them became unavailable and the physical disks where the data log files disappeared. No Windows events. How can he audit what happened at SQL Server level?
Tara Kizer: It’s unlikely a SQL Server problem. I’ve encountered this many, many, many times. You’ve got to talk to your sysadmins or you’ve got to talk to your SAN admins, server admins, they’ve got to take a look. Something happened. It’s almost certainly not a SQL Server issue.
Angie Rudduck: Yeah, if your drives disappeared, it’s probably not SQL Server’s fault.
Brent Ozar: When the drives disappear, I don’t know that you would see a Windows event unless there’s some kind of error message that pops up from the SAN or whatever. I’m assuming it’s a SAN.
Tara Kizer: You would eventually see a SQL Server error once it finally has to write to disk. I mean, it’s going to be a little bit before that happens since SQL Server does everything in memory. So it’s not going to know about it for a while. But the checkpoint, any kind of writing to disk. It’s finally going to start throwing errors and those should be posted in the event log.
Brent Ozar: Backups.
Tara Kizer: Yeah. We’ve encountered weird bugs on like Cisco hardware that caused it and just various weird things. But it has happened numerous times, across many servers, many different hardware platforms, different SANs. It just happens.
Brent Ozar: I think it’s usually it’s just human error. I mean like Robert Davis, a fellow Microsoft Certified Master, just ran a blog post on how he’s like, “Even I screw up.” Drops the wrong one and all these database transaction logs disappear.
Angie Rudduck: Oh yeah, I’ve dropped two databases from prod before. Two separate occasions I have dropped a database from prod. Thankfully both were quick enough recovery. The second one turned out not really used, so that was okay.
Brent Ozar: It’s a matter of time. That’s how you become senior too, you have to have those experiences.
Angie Rudduck: I was just going to say I feel like something that, I’ve met people having only been a DBA for three years, I run into people who have been DBAs for ten years and I know things they don’t only because it’s things I’ve experienced that they never did because maybe they were a smaller shop and I worked in bigger places. Just all about what experience you had.
Brent Ozar: Yeah, everything that involves replication. Tara knows everything.
Angie Rudduck: Somebody already, “Since Tara is here, blah blah replication” question.
Brent Ozar: Here we go.
What’s the best SQL Server monitoring tool to use?
Angie Rudduck: Constantino—I butchered your name, I’m sorry—he has a long-winded, easy question. Basically they’re trying to look for a good monitoring tool for production servers. They’re looking specifically for health monitoring that can alert them when something happens or is going to happen. So don’t get Ignite, it’s not in your list, but don’t get Ignite. He’s looking for a full-rounded solution. They’ve tested a bunch: Spotlight, Foglight, Redgate, SQL Sentry, Idera. Do we have any favorites that we would point them to for health monitoring and for SQL alerting?
Tara Kizer: SQL Sentry provides both with the performance advisor and then the event manager tools I believe. Both of those together can give you everything you need. At previous jobs, we used SQL Sentry at my last job and previous jobs we used Spotlight. I wasn’t a big fan of Spotlight. It was great for looking at certain things. I did set up some availability group alerts but it wasn’t as comprehensive as I wanted. We also had Foglight which I think is now called Performance Analysis. Then we had SCOM, so Microsoft’s System Center Operations Manager with the SQL Server management pack. But SQL Sentry, their two big tools did replace SCOM and the performance analysis tool for us at the time. But it’s pretty expensive. SCOM plus another tool is not as expensive. But SCOM requires almost a fulltime monitoring person that knows how to use it. It’s very complicated.
Angie Rudduck: Yeah.
Brent Ozar: I’ve used all of those too. I’m fine with all of them. It comes down to personal preference.
Tara Kizer: Yeah.
Angie Rudduck: Did he mention Dell’s? That’s Spotlight, right? Dell is Spotlight.
Tara Kizer: Yeah, Spotlight and Foglight. Foglight is the name that we used to call them. I think it’s Performance Analysis, I think. People may still refer to it as Foglight.
Brent Ozar: They renamed it again.
Tara Kizer: Oh they did? They went back to Spotlight?
Brent Ozar: Yes.
Tara Kizer: Oh, I didn’t know that. They were probably sick of people calling it Foglight and they’re like well we should just call it that too.
Brent Ozar: A friend of mine calls them FogSpot. He’s like, “I don’t know which one it is. I’ll just call it FogSpot.”
Tara Kizer: Yeah, one of them.
What should I do about the replication error “undelivered commands”?
Angie Rudduck: All right. So we will move along. Let’s see—not that one—we will go to Nate, with the transactional replication. They have a setup where often they get stalled transactions from the “alert for undelivered commands job.” Any thoughts?
Tara Kizer: Stalled transactions. I’d probably need to see the full error. So undelivered, so it probably means that it’s sitting at the distributor and it hasn’t been sent to the subscriber. I would take a look at the throughput. Take a look at the distributor and the subscriber to see if there’s any kind of CPU issue, possibly it’s just a lot of data got pushed through. Yeah, I don’t know for undelivered commands. Usually it’s a bottleneck on the publisher with reading the transaction log. Maybe there’s just a lot of stuff in there, you’re not backing it up often enough so the amount of data that has to go through is bigger. Mirroring, availability groups, and—well those can add to replication latency because everything gets stored in the transaction log.
Angie Rudduck: All right. So I realized I missed this very small question from Greg, so I will give him some attention. He said he saw some tweets recently that stated you should have four cores per NUMA node. What do we think about that?
Brent Ozar: Somebody was pulling your leg. It’s not configurable. It just comes down to for Intel processors it’s the number of cores per processor. If you turn on hyperthreading, it’s going to suddenly magically double. There are differences under virtualization, unfortunately, it’s such a huge topic you can’t possibly say, “You should always have four cores.” It depends a lot on the host hardware that you’re using and whether or not that hardware is identical across all of the hosts in your cluster. But yeah, anybody who just says four, they’re over simplifying something. Or it might have been for just one particular guy’s setup, like if one guy had just one host design.
Angie Rudduck: Yay for answers where people are pulling your leg.
What’s the best way to create a SQL Server inventory?
Angie Rudduck: Okay. Samuel wants to know, “What is the best way to create a SQL Server CMDB/inventory without buying third party software?”
Tara Kizer: I don’t know what that is.
Brent Ozar: Configuration management. Idera just had a new tool. If you go to Idera.com and click on free tools, I want to say it’s Instance Check, it’s got something along the names of inventory in it or it’s not discovery, but it’s something along the lines of inventory. So go to Idera and click on free tools. The other thing to search for is Dell Discovery Wizard. Dell Discovery Wizard will go through and survey your network and discover SQL Servers and identify them for you. Put them into a database. Another tool that you can use is SQL Power Doc. SQL Power Doc is an open source PowerShell script from Kendal Van Dyke. If I had to pick one that I like, I used Dell Discovery Wizard a bunch of times. Idera’s looks pretty easy as well. SQL Power Doc, not very easy, but super powerful.
Angie Rudduck: Very good.
Should I use multiple Availability Groups or just one?
Angie Rudduck: Eli has a question about availability groups since Bret Ozar II isn’t here. They’re planning on upgrading from 2008 R2 to 2014 to take advantage of availability groups. They would like to know if there is a performance advantage to having databases spread across multiple AGs instead of one single AG. His example is having the primary on one AG but be on a different node than another primary to take advantage of the hardware.
Tara Kizer: Yeah, I mean, definitely. The first part of your question, there is no advantage to spreading them across multiple AGs unless you are putting the primary on a separate replica. But you know, you do have licensing implications in that case.
Angie Rudduck: Curse licensing. Always out to get you.
Brent Ozar: That was a smart question. I’m glad he said move the split around different primaries because I was like, “No, there is no advantage—Oh yeah.”
Tara Kizer: There is an advantage there.
Angie Rudduck: Tricky wording there.
Why am I getting an external component error when installing SQL Server?
Angie Rudduck: Kimberly, welcome back, I haven’t seen you in a bit. She is installing SQL Server 2008 R2 on Windows Server 2012 R2. This is compatible based on MS docs she checked. However, she’s getting the “external component has thrown an exception error.” What is she missing?
Tara Kizer: I wonder if there is a prerequisite that you need to install first. At least on older versions of SQL Server and Windows it was supported on newer versions of Windows but you had to install something first. I don’t remember what it was and I don’t know that that’s why you’re encountering this error. This is the type of thing that I’d probably open up a support case with Microsoft.
Brent Ozar: The other thing, go download that, whatever the ISO file or the [XE 00:09:32] that you got for the download, go download it again and just save it to another place and try it again. Because I’ve gotten so many funky setup errors just from a corrupted ISO file. Then when I go and get another ISO, like bloop, works perfectly. I’d also say anytime you get them, I’m kind of paranoid like this, but anytime that you get an error during setup, I’d rather wipe Windows and start again. I’m just paranoid. I want to build something that’s going to last forever. So if you’re having repeated errors on the same Windows box, hey, go download a new ISO and then try again on a fresh install of Windows.
Tara Kizer: You can go through the setup logs to see if there’s a better error because that’s probably a pretty generic error. The problem with the setup logs is it’s hard to find the errors. Scroll all the way to the bottom and then you might have to start scrolling back up to see the failure. Because even though it failed, it might have done a lot of extra work afterwards and all of that is going to be logged.
Brent Ozar: There’s like 50 million instances of the word error in the log.
Tara Kizer: Yeah, exactly, it’s awful.
Angie Rudduck: I do like the trick that I learned about filtering the Windows log, starting here during triage. I had no idea about that and then one day when I watched you do triage and you right clicked on the left side bar, I was like, “What? I only knew…” Because half of the time during triage I have to ask the client to move my head because it’s always floating right over the filter log on the right panel in Windows events, so that happens a lot. I’ve been trying to work around not asking them to move my head because it sounds weird to me.
How should we do source control with SQL Server?
Angie Rudduck: Since we’re talking about a lot of software, let’s ask another question from Scott. Do we have any suggestions on source control software? When Richie is not here of course.
Tara Kizer: Yeah, I was going to say, Brent and Richie love Git.
Brent Ozar: So there are two ways you can do this. One is you can do source control before you deploy your code, meaning you go make a stored procedure change, you check the code into GitHub or Subversion or TFS, whatever tool you want to use. That’s proactive. Then you go deploy the code after you’ve checked it in. Man, almost nobody does that. Plus too, you end up having DBAs who need to change indexes in production or need to make an emergency change. So the other way you can do it is reactive source code control which means this tool goes out and looks at your SQL Server every x number of hours and then goes grabs any changes and checks them into source control. So this gives you a postmortem log of everything that changed but not who did it and not the exact time that it changed. I am personally a fan of reactive source control as a DBA. I don’t really care as much about who did it but I want what was changed. I want a breadcrumb list of everything that changed on objects. So Redgate’s source control for SQL Server has that ability that they’ll just go through and patrol your SQL Server periodically and check in any changes. It’s just not the source control that your developers are used to. That proactive change control is really, really hard.
Tara Kizer: We did both proactive and reactive at my last job. We used Visual Studios Team Foundation Server. Anytime we did deployment of the application, that was always proactive. And of course, DBAs, you know, are having to make changes. The DBAs were supposed to go in and do a schema compare and then update TFS. That didn’t always happen. Other tasks were more important. So whoever that next person was that touched that database, when they did the schema compare to create the deployment scripts, they would see that ther are these other things that shouldn’t be in my deployment that they’ve already been deployed to production but weren’t in source control. Besides that though, because you could have databases you never touch again. So besides that, twice a year they went through and did a schema compare against all databases and got them up to date.
Brent Ozar: Scott asks, “I didn’t know about reactive source control. Who makes it?” It’s a technique, not a product. It’s just part of Redgate’s source control as well. I think I even still have a blog post on our blog about how you do it with a batch file. Like I wrote a batch file in VBScript to do it with Visual SourceSafe. I need to burn that in a fire.
Angie Rudduck: That sounds complicated to somebody who’s going to try and totally mess up. That was cool. I was about to ask Tara if you could do them together. So that’s cool that you have seen them both together because I was like I feel like one place we didn’t consider indexes, we didn’t let developers change indexes. So if a DBA throws them in and then doesn’t check it in, that would be great to have the reactive right there.
Tara Kizer: Yeah, as long as you have a schema compare option in the tool that you use. Or you can get another schema compare. Then you can see what the changes are between source control and your database.
Angie Rudduck: Very cool.
What’s the fastest way to modify a big table?
Angie Rudduck: J.H. wants to know, “What is the fastest and/or safest way of exporting a large table and then reimporting it and maintaining its primary key auto identity seed ID … SELECT * into temp table from large table or bulk copy out or something else?”
Brent Ozar: Okay, so I’m going to tell you the terms to google for: modifying a table online Michael J. Swart S-W-A-R-T. So because you said fastest, Michael has an interesting set of blog posts, it’s like a five-part blog post on how you go set up the new table, how you build something to keep the old and new table in sync and then you move data across in batches. So this way end users notice very minimal downtime and yet you’re able to keep the two in sync as you get larger. The only time I would go that route is if, “You cannot take any down time. We’re willing to let you put a whole lot of development work into it” and it’s more than like say 50 gigs in one table. If it’s less than 50 gigs in one table, I would probably just do a select with a tablock and move the data across that way.
Tara Kizer: Then you can use the identity insert option to handle the identities. That way you keep the values the same between the two tables. So SET IDENTITY_INSERT ON. You can only have one table at a time do this so make sure you set it off when you’re done.
How should I manage identity fields with replication?
Angie Rudduck: That’s a perfect follow into Paul’s question. He has existing replication where he wants to change the identity management of primary keys have identity 1 1. He wants to change the primary keys to identify 1 2 on the publisher and identity 0 2 on the subscriber. Is there a way to do this without recreating the tables?
Tara Kizer: You do have the DBCC command where you can change the seed but I don’t think that you can change the increment. Usually, in a scenario like this though what people do is they have the publisher, it’s inserting positive numbers and then on the subscriber inserting negative numbers. So you would have, you know, if it’s an integer, you could have two billion rows for the subscriber and two billion rows in the publisher. That usually satisfies most tables. Otherwise, go to bigint.
Brent Ozar: So there’s a DBCC command to reseed the identity. I cannot remember for the life of me what the syntax is but if you search for that.
Tara Kizer: Yeah, I think it’s [Inaudible 00:16:26] IDENT is the command.
Brent Ozar: Yeah, you just run that on the one where you want to change them.
Angie Rudduck: Good info.
Should I use checksum when taking backups?
Angie Rudduck: Samuel wants to know, “Is it best practice to always add checksum when taking backups?”
Brent Ozar: Did you do that when you were a DBA?
Angie Rudduck: I didn’t.
Brent Ozar: You too, both of you, yeah, yeah.
Angie Rudduck: I didn’t know it existed.
Brent Ozar: I don’t think most people do.
Tara Kizer: I knew it existed. Did we do it? Probably not. It does add overhead to the backups and we were—at least a lot of the critical systems we would always, not always, but we would have a backup restore system. So we were testing our backups regardless. So do you need checksum if you are going to be testing your backups?
Brent Ozar: Yeah, I learned about it after I got started consulting. I’m like, oh, that’s an interesting idea. I went on a little quest of “I’m going to get everybody to do checksum on their backups.” I put it in sp_Blitz as a warning, “Hey, you’re not doing checksums on your backups.” Universally, people were like, “What is that? Why would I want my backups to go slower?” So I took it out as a recommendation just because people don’t like their backups going slower.
Tara Kizer: Does Ola’s solution, does it do the checksum by default?
Brent Ozar: Not by default, yeah.
Angie Rudduck: I think it does.
Brent Ozar: Oh, does it?
Angie Rudduck: Because I’ve been playing around. Yesterday I was playing around, let me double check my savings here but I ran the scripts default and then took a look. So would have to double check, but it’s included as an option at the very least.
Brent Ozar: And doesn’t his do verify by default too out of the box?
Angie Rudduck: Yeah, maybe it does verify by default and not checksum by default. But the verify, I mean the one thing I don’t think people think of is like how it can impact because you might be, “Oh, my one gig backup is taking 20 minutes.” I don’t know. But it’s because it’s just doing the restore verify only command against the backup it just took. So it’s just saying, “Oh, is this still a valid backup?” And at the basic level, right? Correct me if I’m wrong, but it’s only saying, “Oh, yes, I can open this as a file. I don’t know its validity inside.” Just that it could reopen it as needed. So that’s just something to be considerate of, it’s not the saving grace. “Oh, I did verify only.”
Brent Ozar: Yeah, it could be all corrupt data in there. It could be 100 percent corrupt. The way you’re going to have to find out is to run CHECKDB.
Why don’t our SELECT queries show current data?
Angie Rudduck: All right.
Brent Ozar: We’ve got all kinds of questions coming in. It’s crazy.
Angie Rudduck: I know, they’re definitely falling in now. Okay, so Andrea says they have been having issues with data not showing up in reports for sometimes up to 15 minutes. They are an OLTP shop running 2012 Web. Is this possibly a thing with SQL or is it due to something else?
Tara Kizer: I think we would need more information as to how is the data getting into this database? Is it queueing? Is there a backlog in say a Tibco queue or something like that? Or, you talk about reporting, do you have a replicated system? Or in availability groups, readable secondary, maybe there’s a delay in getting the data to those. I don’t think we have enough information to answer it.
Angie Rudduck: Yeah, I agree.
Brent Ozar: It’s never normal to do an insert in SQL Server, commit your transaction, and then not have it be available for a select immediately.
Why am I getting tempdb-full errors when my tempdb is 4GB?
Angie Rudduck: Let’s see what David has to say. He’s getting this on a server with four by one gig temp data, eight gig temp log, insufficient space in tempdb to hold row versions. Need to shrink the version store to free up some space in tempdb.
Tara Kizer: That’s a pretty small tempdb. I’ve supported tempdbs that were half a terabyte in size just because we had business users running ridiculous queries. So, first of all, why is your tempdb so small? Are you shrinking it down? You probably need some more space. Version store, are you running read commited snapshot isolation level? So you need more space for tempdb.
Brent Ozar: And then how much space do you need? Generally, if somebody puts a gun to my head and just says go pick a number, I’m going to go with 25 percent of the size of all the databases combined. So if you have say 100 gigs on the database server, you probably need at least 25 gigs for tempdb.
Tara Kizer: A few jobs ago, we set up hundreds and hundreds of servers. So we just made a policy and tempdb we set at 70GBs. These were shared servers with lots of databases and we didn’t know what was going to occur on them. We would have alerts to warn us if data files or the log file was creeping up on, if they were going to fill up, so we could react to those. But 70 GBs for all of the tempdb data files and I believe 30GBs for the tempdb log file. That was just our default.
Brent Ozar: I don’t get out of bed for less than 70 gigs.
Angie Rudduck: Silly, silly.
If I don’t know monitoring tools, will that hold me back in job interviews?
Angie Rudduck: Ronny supports about 25 prod and dev databases as a DBA. He’s not in the corp Windows DBA group and does not have access to all the tools monitoring performance, etc. “All monitoring I have in place is based on scripts that run and report issues. Will the lack of experience working with the tools like that hurt my chances with pursuing a new DBA job? I know it really depends on what the hiring manager is looking for but is knowing tools like that an important skill to have?”
Tara Kizer: I don’t think it’s an important skill necessarily, I think it’s obviously going to depend company to company but if you don’t have any experience with monitoring tools, I think that that’s fine as long as your other experience, your actual SQL Server experience, is what they’re looking for. You can get up to speed on these tools, I wouldn’t say fairly quickly, but you can at least click around and figure things out and with some guidance get some pretty in-depth knowledge of these tools. For the most part, I don’t think that companies are paying for tools like this. So I think that it’s pretty rare that companies have these tools in place.
Angie Rudduck: Yeah, unless you’re going to a large DBA shop, I don’t feel like you’re probably going to have very many of these tools.
Brent Ozar: And you’d have to know all of them. I mean, you know, if you only knew one and then somebody doesn’t use that one, you’re screwed.
Angie Rudduck: It’s not the same thing as not knowing SQL Server versus MySQL versus Oracle. They all run pretty similarly and nobody expects you to know all of them or they’re only going to hire you if you know this one. Like if you only know Redgate, great, because they’re a Redgate shop. That’s usually not the case.
Brent Ozar: Yeah, when we do interviewing for consultants for example, so when we go and hire people, we will often give them a virtual machine and say, “Now you’re on, take remote control of this thing. Show me why it’s slow.” Or, “Troubleshoot why this query isn’t working.” If someone comes to me and says, “Well, I’m sorry, all I can do is use a tool,” like I only ever troubleshoot this with Dell or Idera or Redgate and I’m lost without a third party tool, you’re not going to do well as a consultant because we can’t rely on those tools either. When we parachute in, man, I have no idea what’s going to be happening on the other end. So it pays better to know the native ways to doing things.
Idle chit-chat about smoking weed and your friend sp_WhoIsActive
Angie Rudduck: I think we have probably time for one more question. Did you guys see anyone while I scroll back and look?
Brent Ozar: Greg says he remembers that the tweets about tempdb stuff was four tempdb files per NUMA node. They were smoking weed too. I don’t know who that was.
Angie Rudduck: They must be in Portland.
Brent Ozar: Yeah, Denver, something like that.
Angie Rudduck: Someone chimes in, Eli says, “The sp_WhoIsActive is your friend about the monitoring” to you Ronny. That is a good point, we love…
Tara Kizer: WhoIsActive and the Blitz stuff.
Brent Ozar: Yeah, free tools. Pro tip: If you’re going to apply for work at our company, you may want to try using our tools. Know that they’re out there. If you come in and use someone else’s tools, it will not be a good time for you.
Angie Rudduck: Yeah, I agree.
Brent Ozar: Not that you ever need to know how to use our tools to work here. We teach you those too after you get here. But, yeah.
Is PostgreSQL better than SQL Server?
Angie Rudduck: Yeah. I feel like there was one that I… There’s a couple that are like…
Tara Kizer: Wes asks the same question, I think he wants—they’re 20 minutes apart.
Angie Rudduck: He really wants me to read his question. Wes, I’m going to tell you my answer is SQL Server pays our bills. Wes wants to know what our thoughts are on Postgres versus Microsoft SQL Server. SQL Server.
Tara Kizer: We’re SQL Server professionals so our answer is going to be SQL Server. If you want me to support your system, I don’t do Postgres SQL so I can’t support it. I mean, I could probably learn it but I don’t really have any interest in learning it.
Brent Ozar: See, I don’t support it either. But I always try to learn about other stuff. There’s stuff that’s really cool about Postgres. Unlogged tables is a classic example. If you search for Postgres on our site, we’ve written a couple blogposts about different features in Postgres that we would want in SQL Server. But boy, at the same time, I kind of like parallelism. Man, Microsoft SQL Server has had parallelism for a long, long time. That’s kind of nice in today’s huge, multicore environments where 16 cores isn’t a big deal. 32 cores isn’t a big deal anymore. Parallelism is pretty freaking awesome. And they’re iterating super fast. So, yeah, I kind of like Microsoft SQL Server. If I was going to start a career from scratch, so like Microsoft is where it’s at in the enterprise environment and Postgres is where it’s at in the startup environment. Well, thanks everybody for hanging out with us today and we will see you guys next week.