This week, Brent, Erik, Tara, and Richie discuss cross-database transactions in AGs, agent job owners, column store indexes, linked servers, migrating SQL Server databases to AWS cloud, synchronous readable secondary, rebuilding indexes, backups using SANs, licensing, Always On Availability Groups on CDC, Azure Cosmo DB, compression, and encryption.
Enjoy the Podcast?
Office Hours Webcast – 2017-6-21
Cross-database transactions work in AGs – why aren’t they supported?
Brent Ozar: Brian asks, “In SQL Server 2016 and below, Microsoft says cross-database transactions aren’t supported with AGs, but I’m testing and it seems to be working, what’s the big deal? Why shouldn’t I do cross database transactions and AGs?”
Tara Kizer: It’s not that you can’t do them. What’s going to happen on the rollback, isn’t that where the issue – there could be an issue on the rollback, so as long as your servers are up and there’s no issue, it’s going to look like it’s working. I would test – do a BEGINTRAN in one of them and do a bit of changes and then restart one of the servers. Do this in a test environment, obviously. Or if it’s on the same server, crossing the same instance and just, you know, take the other database offline and see what happens on rollback.
Erik Darling: What’s the gotcha on distributed transactions, like when you actually use DTC and not just cross database?
Tara Kizer: [crosstalk] with AG 2016 – you know, I just did an AG client yesterday so I have this fresh in my memory, but you need Windows 2016, and our deck says that it’s somewhat supported. DTC support and Windows 2016, sort of…
Brent Ozar: It’s so bad. It’s only supported between two databases on different servers [crosstalk] database transaction, they can’t be in the same AG, they can’t be on the same server. So it’s – if you go to the Books Online page about AGA support for DTC, scroll all the way to the bottom, there’s a beautiful example just in plain English, like a thought experiment that shows you why you’ll lose data.
Erik Darling: I think they fixed that up a little bit for 2017, like it’s fully supported.
Brent Ozar: Yeah, it’s totally supported…
Erik Darling: On the same server and everything.
Brent Ozar: There’s a gotcha, when you set up the AG, you have to define where you want to lose data. If you have extra data in one database, do you want to discard it or how do you want to react to it?
Erik Darling: Oh that gave me the chills, I’m sorry.
Brent Ozar: But at least it’s supported, because how often does that really happen, you know, that you failover mid transaction? I say it sarcastically…
Richie Rump: On purpose?
Brent Ozar: Right, on purpose you would just do a plan failover and it would be pretty graceful, it wouldn’t be a very [crosstalk]…
Should I change my Agent job owners to SA?
Brent Ozar: J.H. says, “is it safe and best practices to switch all my agent job owners to SA?” He said, “I’m worried about someone leaving and having their login disabled.” What do you guys for agent job owners?
Erik Darling: That.
Tara Kizer: Yes, that’s the best practice. I had a client recently where it had created an sp_whoisactive job, so that it would start collecting some data for us, and they saw me adding SA as the job owner. They were like, oh wait a second, SA is disabled. It’s like, it doesn’t matter, still is going to work.
I’m trying to compare 60M rows over linked servers…
Brent Ozar: M.M. says, this is probably going to be good for Richie, M.M. says, “I’m trying to help a developer…” I already love this guy, or woman, who knows? M.M. I can’t tell. “I’m trying to help a developer and myself to improve performance. We have a query that every day it joins a table across network servers using linked servers or open query, and it’s trying to compare 60 million rows, 11 columns on both of them, one to one. What are my best options?”
Tara Kizer: Unsubscribe. You need to use SSIS for that type of thing. Don’t do this type of stuff in T-SQL. SSIS can handle this more efficiently than T-SQL can.
Brent Ozar: Richie, what’s your thoughts too?
Richie Rump: Yeah, my first thought was move everything to one server, but yeah, SSIS would be better, especially with that kind of load, you know, 60 million rows. Yeah, probably better.
Brent Ozar: And if it’s only once a day, if you wanted to move everything onto the same server, you just do a restore. Just backup – take your nightly backups and restore them over onto the other server, because it doesn’t have to be up to the exact moment in time.
Can you explain columnstore indexes?
Brent Ozar: Let’s see here, Michaela – see I remembered your name from last week. Michaela says, “can you guys explain column store indexes?” I’ve got to start reading these questions before I ask them.
Tara Kizer: Erik, that’s on you… And go.
Erik Darling: Well you see, there’s this dictionary… So column store indexes are neat. They are meant for big analytical or data warehouse type workloads. They have excellent compression, and assuming you have all your row groups and you’re not using any data types that disagree with column store indexes, like for example, strings, and you don’t care much about constraints, they’re great. But there are a lot of gotchas with them; probably too many for me to get into here. Also, they’re not good for OLTP type workloads because when you’re doing lots of little inserts, updates and or deletes, you end up with all sorts of bad things. Well, you end up with things that don’t get compressed correctly or fully and it’s just not really a good scene for that.
Is Microsoft getting rid of linked servers?
Brent Ozar: So let’s see, Dorian says, “does Microsoft have any plans to get rid of linked servers?”
Erik Darling: I don’t know how you would.
Tara Kizer: I don’t think you could.
Erik Darling: One could only hope.
Brent Ozar: And cursors and triggers and…
Erik Darling: Everything I don’t like.
Brent Ozar: And I would like a pony. No, yeah so probably not, because as long as anything’s being used out there in the wild, Microsoft’s all about adoption rate. Like they want to get more people using the product and taking stuff away is a sure fire way to get them to stop upgrading.
Erik Darling: And plus, if you take away linked servers, you take away a very easy path that people have to take data from another platform and put it into SQL Server and start paying Microsoft money for licensing.
Brent Ozar: That’s true.
Erik Darling: How else do you get off MySQL?
Brent Ozar: [Laughs] … Well, you make that face, you’re using Postgres right now, so at least better than MySQL [crosstalk].
Erik Darling: Different universes, man.
Will AWS help me migrate into their cloud?
Brent Ozar: J.H. says, “does AWS offer contractors to help us migrate SQL Server databases up into the AWS cloud?”
Erik Darling: They’re called Snowballs.
Brent Ozar: Oh that’s true, talk about what a Snowball is.
Erik Darling: They send you a cooler with a disk drive in it and you move all your data to it, and then you ship it back to Amazon and then Amazon puts it magically up into the cloud. So if you’re dealing with terabytes and terabytes of data, it’s awesome because it’s much safer and quicker to do that. Imagine the feeling of upload failed on like ten terabytes. So it’s a lot safer and easier way to do things. Probably they wipe the disk that they sent you before they give it to anyone else.
Brent Ozar: AWS Snowball, very cool. And do they have consultants? Yeah, sure, absolutely, and there are tons of people out there. That’s not something that we do, we don’t help you with the migration process, although we do help you sketch out should you go to AWS or not, and then should you do RDS or EC2.
What’s the best way to use SSRS with AGs?
Brent Ozar: Steven says, “Thank you so much for running these sessions…” You’re welcome. “I watch every week on Youtube and this is my first time watching live…” Oh, very cool. He says, “what’s the best way to use SSRS with AlwaysOn? We’re moving to AlwaysOn with a synchronous readable secondary. Okay, so let me stop there and ask, well what are your thoughts when you hear synchronous readable secondary?
Tara Kizer: I’m like, yes. I’ve supported, you know, extremely mission critical system that had high-performance requirements and, you know, an extremely busy system and you have to be careful on that synchronous readable secondary, because the load there can impact your writeable side. And we certainly had issues with that, but our reports had to be near real time, and an asynchronous readable secondary could have significant latency, depending upon your transactions.
We didn’t do index rebuilds on this system anymore because of the latency on the synchronous server, the blocking that it would cause there. But a lot of people are doing index rebuilds, and just imagine what the latency would be on the asynchronous readable secondary. So I’m okay with what he’s saying, it’s just that you have to make sure you have the hardware to support it, the indexes to support your queries and the right queries on that synchronous readable secondary. If you don’t, you’re going to cause problems on your primary replica.
Erik Darling: You know, and Tara’s talking about rebuilding indexes, which I totally agree with, that’s a pain in the butt with AGs, but even if you find that after you’ve put stuff over on the secondary and all of a sudden you have this additional reporting workload or a change in reporting workload and you need new indexes, even creating indexes over there, which has the same effect as rebuilding, can be difficult. So words of caution there as well.
Brent Ozar: He has a follow-up part, he says, “would it be best to install SSRS on that secondary node or have another VM with SSRS on it?” And that probably answers your question there. As little workload as you can get on that VM is probably a good idea.
Tara Kizer: Well that and the best practice is not to put SSRS on your SQL Servers anyway. This needs to go on a separate server…
Brent Ozar: And why?
Tara Kizer: Well I’m all about SQL Servers being dedicated to SQL Server. Yes, SSRS is a SQL Server product but it’s really a separate product, I’m talking about the database engine, only supporting software like maybe antivirus software or tape backup software should go on your server but everything else – SSRS can go on a smaller VM. It doesn’t need the capacity that we have for the database engine.
Erik Darling: That’s generally our recommendation for any of the, sort of, tertiary non-engine components, SSIS, SSRS, put them on a different server. You don’t want them competing with your SQL Server, just because those resources are precious and, you know, unless you have things really well defined with how much stuff actually needs. They’re going to walk on each other and it’s not going to be pretty.
If I have two SANs, where should I store my backups?
Brent Ozar: James says, “we have two SANs, we have one SAN just used for data and one SAN is for backups. To make life easier in the event of either SAN failure, where would I store my SQL Server backups? Should I store my SQL Server backups to the data SAN or to the backup SAN where my VM backups are kept?”
Erik Darling: Yes.
Brent Ozar: That’s kind of interesting, yeah…
Erik Darling: I would do both. I would have them in both places.
Brent Ozar: I kind of like where you’re going with that. Well yes, because you don’t want to lose point-in-time recoverability if you lose either SAN, that’s a good point. And your backups aren’t usually that big. I mean, you don’t have to keep 30 days of backups on both SANs.
Does SSRS need a license when run separately?
Brent Ozar: Ben says, “if SSRS is on a separate SQL Server, does that require a separate license?”
Tara Kizer: Sure does. Sorry, best practice.
Brent Ozar: It was cheap – you might be able to get by with Standard too, you may not need Enterprise Edition there.
Erik Darling: And it can be a much smaller VM, you know. Like on your regular one, you might have 16, 24 cores on the SSRS when you might be able to get away with 4, 6, 8.
Have you used 5 AG replicas with CDC?
Brent Ozar: Seybou asks, “do you guys have any experience of using Always On with five replicas with Change Data Capture (CDC) enabled? What are some of the issues that might happen?” Any Of you guys used CDC?
Tara Kizer: I have used CDC but it was not on an availability group server. I can’t really think of issues – well I can think of one issue, CDC uses the transaction log. That’s where everything gets logged, so you get CDC reading the transaction log and you’ve got availability groups reading the transaction log on what to send out to the other servers. So I know when you have replication and database mirroring, those could have contention because everything’s reading the transaction log, so there’s trace flags in place that can help you to have – I think it’s to have replication skip over the mirroring records and things like that. I wonder if there’s something for CDC that would allow it to – you know, so it doesn’t have to scan everything in the transaction log, because I’ve had issues with that in the past. I have had supported systems with five replicas, but not with CDC enabled.
Brent Ozar: Gotcha, Seybou, if you want more info in there too, you might want to let us know what you’re going to use those five replicas for, what you need all five for, and maybe what you’re using CDC for as well.
What is Microsoft CosmosDB?
Brent Ozar: Folks, we’re down to a couple of questions left in the queue. If you have questions, feel free to put them in over in the questions pane and GoToWebinar. Alexandre asks, “guys, what is Microsoft Cosmo DB?”
Erik Darling: It’s what Document DB used to be.
Brent Ozar: And go on about – so was it just a rename or?
Erik Darling: I’ve read nothing about it. I know that it used to be Document DB and it’s a NoSQL document store with lots of JSON involved and that’s where I bail out. That’s where I pull the parachute whether I’m jumping or not.
Richie Rump: Yeah, didn’t they add some other types to that, as opposed to Document, there’s key value pairs and stuff like that?
Brent Ozar: Yeah, they had Document DB for the longest time and it wasn’t catching on. And I looked at it, we looked at it I think, using it briefly for execution plans, but the maximum document size was like 400KB. So it was like a key value store originally, then they just kept layering more capabilities on top of it. And now with the new Cosmo DB, they renamed it. They clicked file save as in visual studio and they saved it under a different name. Now Cosmo DB has all kinds of other things it can store instead of just 400KB key value pairs, but in terms of, like, us knowing any more than that, no, because stuff – like all the NoSQL databases, you generally have to write your frontend app differently or your backend apps differently to access data in different ways. It’s not like they’re just capable accepting inserts, updates and deletes, you have some different work to do, so it’s not lightweight.
Can I compress backups of varbinary(max) data?
Brent Ozar: James says, “our beautiful and attractive developers store attachments in the database, so I have a lot of columns that are varbinary max. Is it possible to compress my backups?” Well so generally, if it’s binary files they’re probably not all that compressible to begin with, like if it’s Excel or Powerpoint.
Erik Darling: Yeah I’m just thinking about, like, the way that gobbledygook compresses when you use TDE, and I don’t think that does too well either. I wonder if even, like, page compression would work well on those at all?
Brent Ozar: If it’s off-row data you’re screwed. Yeah, as soon as it goes off-row, you’re out of luck. What I would do is, if you have anything like – I always blast dedupe appliances. I’m like dedupe appliances are horrible for SQL Server backups, but this is kind of what they’re designed for. They use additional CPU power in order to do like single instance storage. It might actually help you out there. But the other thing you could try is you could just try zipping or RAWRing your backups, just the files. See if they compress at all, if they don’t you’re screwed.
Erik Darling: Yeah, we have a 7zip script. What I would focus on, rather than that, is trying to talk my developers out of doing that. Just storing pointers to the files on the file system, getting rid of the blob files in the database and just, you know, pointing to wherever they live and letting SQL find them – or letting the application fins them that way, just doing the lookups through SQL.
Richie Rump: If the developers are beautiful, does that mean they took beauty over brawn… Brains.
Brent Ozar: He actually said they were stupid, but I didn’t want to say…
Tara Kizer: I was going to say. I read the question.
Richie Rump: Oh there it is, yes, stupid developers. Well I can’t argue with that one.
Brent Ozar: Man, … there’s my little soapbox is – you know, we always think – I say we, as database administrators, a lot of us have this inflated ego, like we know everything and those other people in the shop are stupid. And then we run sp_Blizt on your servers and I go, you don’t have any pants on, you know, just stop calling other people stupid, you don’t have backups set up.
What backups can I take on an AG replica?
Brent Ozar: Brain says, “For availability groups, can you take differentials and log backups on my read – which backups can I take, fulls, differentials or logs on my secondary replicas?”
Tara Kizer: No differentials and only full copy only, but you can take log backups there. I don’t recommend taking backups on secondaries, honestly, because they could be out of date. Even a synchronous commit replica can be out of date. SQL Server could drop it to an asynchronous replica silently because of some timeout issues with syncing the data over there. So I’ve only ran all my backups on the primary replica, and if you’re trying to offload that kind of task, I wonder if maybe you’ve got some I/O issues, you know. Why are you having to offload a backup task? Since when did backups cause issues?
Erik Darling: Tara, you said one thing in there very causally but it’s something that no one – it’s something that I forget a lot about AGs, is that behind the scenes, they could get dropped to async if SQL Server detects enough latency. And that’s something [crosstalk] insane.
Tara Kizer: I actually didn’t even know that, you know, until I saw the blog post from, I think it was Brent, you know. The quote from MSDN is like oh, I wonder if that actually ever happened to me. I’ve no idea because it’s doing it silently, you don’t know, and you know, I’ve never had an issue with failovers. Obviously a failover is not going to occur if it is an asynchronous replica. It’s going to say it can’t do it right now, or whatever it’s going to say. So I don’t even know if it’s every occurred on the systems I’ve supported.
Erik Darling: That is not a bolded bullet point in the Microsoft pamphlet about availability groups. That is not up front.
Brent Ozar: [crosstalk] I should totally do a blog post of my favorite little hidden nuggets in Books Online, just gems. Because the one like the async index DMVs, column order is not a part of our recommendations, we’re just giving you a CSV list of fields, and everybody assumes that when they see the missing index creation in an execution plan or in the DMVs, they’re like oh this must be the order, right, they’re comma delimited.
Tara Kizer: Did that issue exist for database mirroring, for synchronous database mirroring?
Brent Ozar: I don’t remember if it was in Books Online or not, because I hardly ever use synchronous database mirroring. It’s one of those, looked really cool but…
Is full text indexing a performance drain?
Brent Ozar: M.M. asks, “some of our servers have full text indexing turned on and other ones don’t. The architecture was set up by an accidental DBA; could this be affecting performance? How do I know if full text index is being used, or can I just turn it off?”
Tara Kizer: Turn it off and wait for the problems. I’ve had it installed on most of my servers, you know, on failover cluster instances where it might be hard to add later. It doesn’t cause any performance issues, you know. When I’ve had performance issues on a cluster, I’m looking to maybe stop some services and I have had issues where that service was failing, that cluster resource was failing. And I didn’t want to bother with trouble shooting, so I just deleted the resource, you know, because I didn’t need it. I’ve never had it cause performance issues where it wasn’t being used.
Erik Darling: Yeah, I’m trying to think of ways you can figure it out, like maybe if you looked at the last crawl date – like if you look at some of the DMVs for full text indexes, you could see the last crawl date and the last crawl completed date. You might want to look through your plan cache to see if there are any queries that have contains or contains table. And I’ll make a deal with you, if you can send me a query that has XML that uses full text, I’ll see if I can get something into BlitzCache that detects when full text indexes get used. I can’t guarantee it, because I don’t know if it’s going to be in the cache plan versus the actual plan, but I’ll see if I can do something with it.
Is there a future in SQL Server health checks?
Brent Ozar: Dorian says, “I saw an ad from somebody where companies can pay to have a DBA look at their databases. Do you see this as becoming the norm?” Well given that that’s what we do for a living, yes.
Erik Darling: No, we should be the only ones doing it [crosstalk] huge outlier.
Brent Ozar: We got in – it’s funny, several of us on here have been through – I originally got started with, it was Microsoft’s RAP, they have this – I think it’s called a risk assessment program, or risk assessment something or other.
Tara Kizer: Something, yes.
Brent Ozar: But it was – I went through that process and I was like, this is amazing, you learn so much. They give you like a 400 page PDF of everything about the database. And I was like, well I really just want the TLDR, I just want to too long didn’t read. Give me the five things I need to do or whatever, but it seems to be working out well; we get clients. Knock on wood, I should touch wood.
More about that 5-replica AG…
Brent Ozar: Seybou on his five node availability group, he clarifies back in, he says he’s, “got two nodes for high availability, two nodes for disaster recovery in another location, and then a third node for reporting that’s all set up in with the primary [crosstalk]
Tara Kizer: Same exact configuration I had when I first deployed availability groups. We actually had two synchronous replicas, secondary replicas on SQL 2012. One was for reporting, because it needed near real-time reporting, and the other one was the failover node, and then two asynchronous servers out in DR. We actually had three out there, because we needed exact hardware that we did at the primary site in case we had to do a failover to that environment, we had to run full power. But on 2012 you could only have four secondary replicas, so we had a cold standby. Microsoft wanted us to log ship to it, but I was like – once we did the manual failover, that’s when I’ll pull that one into the AG, yeah.
How can we encrypt data on Standard Edition?
Brent Ozar: Let’s see, Steven says, “if we don’t have transparent data encryption because we’re only on Standard Edition, what’s the best way to encrypt our data? Should I consider BitLocker?” It’s going to kill me – Netlib… So there’s a company called Netlib that offers a filter driver that sits in between SQL Server and the file storage, and they’ll encrypt the data. It’s way cheaper than Enterprise Edition. I’ve never used it. I did a webcast for them years ago and I was like looking at the product and I was like, this is kind of amazing. But I’ve heard good things about it since, I just haven’t used it myself. So Netlib…
Tara Kizer: That’s interesting that they picked that for their company name, because that’s the net library. That’s how you connect to the database drivers, I believe. Netlib was the GUI to them, I think.
Brent Ozar: It was an odd name [crosstalk]
Richie Rump: Nuclear option, have the app encrypt it.
Brent Ozar: And so talk about what the advantages of that are.
Richie Rump: So the advantages of that is, is that from the app and all the way to the database, the data will be encrypted. So that’s the big thing; and then if you don’t want anybody who has database access to be able to get to that data, it will be encrypted in the data. The problem is that if you want to do a SQL query on that, it’s kind of encrypted too.
Brent Ozar: Linked server queries, replication, etcetera…
Erik Darling: I would just avoid BitLocker, because that only encrypts the drive and if you move data off the drive, it’s not encrypted. And if you have BitLocker on other drives, like I have it on my laptop, and every time I boot my laptop up, I have to unlock the D drive. If you have a server go down and come back up, you have to have someone unlock that drive or it’s not really all that bit locked. If it’s just available when you start up, it’s not as useful.
Can you provide reference links?
Brent Ozar, Srijith says, “can you please provide a reference link for an AG secondary backup can cause an issue due to loss of connectivity?” I don’t know if we can do a reference link or whatever; I would just say kind of think through it. If the primary stops sending data to the secondary, for example, if the secondary restarts or gets behind, you can backup that thing every minute, but if it doesn’t have fresh data and your primary goes down, I hope that resume’s up to date.
Erik Darling: even just a network hiccup could put things behind a little bit.
Can my SAN encrypt my data?
Brent Ozar: Ben says – and I hear this about the cloud too. Ben says, “if you’re on a SAN, don’t most SANs have an option for encrypted storage?” oh man, so this is kind of tricky. So a lot of SANs have the opportunity – or, like Azure calls it SSE, self signed encryption stuff, where they go, scouts honor, promise, everything’s encrypted and we can’t access it. You don’t have to set up a key; we just do it for you automatically. And that’s true; right up until you do something like take a backup. You can take a backup of your database to another location, it doesn’t matter if it’s encrypted or rest, you’re still kind of screwed there. Same thing with querying the data, like Richie pointed out. If you’re worried about the DBA selling your data to the Russians, you’re still kind of out of luck there.
Richie Rump: [crosstalk] very much, yes.
Erik Darling: That was a full sentence.
Brent Ozar: There are a lot of Russians in South – I keep wanting to say North Cuba or South Miami, right. There’s a Russian contingency there.
Richie Rump: Yeah, they’re like a North Miami Beach, yeah, so…
Can high performance power mode cause problems?
Brent Ozar: M.M. asks, “somehow all our power plans reverted to balanced.” I’ve seen this with group policy, group policy’s done this. “Have you ever seen an instance where going from basic to high performance has caused a problem?” That’s a good question; I don’t think so. Not that I can think of. I’ve seen it going the other way, you know, going from high performance to basic has caused a problem.
Tara Kizer: He should be able to just do it live, don’t even have to reboot the server. The only comment, you need to look at your BIOS and if this is a VM environment, also look at the BIOS level setting.
Would you rather…
Brent Ozar: And then the last question we’ll take for today. Steven asks, “would you rather be chased by 1000 duck sized horses or one horse sized duck?” It’s a tongue twister even.
Erik Darling: I’ll take the one duck.
Richie Rump: yeah, I’ll take the one.
Brent Ozar: I don’t know, see I think I would die of laughter if I was being chased by 1000 duck sixed horses. I would – I’d just lay down and go see what happens, until they start peeing on me, that would probably…
Tara Kizer: If I was being chased by one horse sized duck, I would wonder if I’m in one of those movies from a few decades ago where it’s like these giant insects are coming through the city.
Brent Ozar: Honey I Shrunk The Kids…
Tara Kizer: I’d wonder if I was dreaming.
Erik Darling: You know what the problem with the little horses is? They’d be too cute. Remember there was that commercial? I forget what it was for, but the guy had a miniature pet giraffe, and at the end of the commercial – it was like a CGI pet giraffe and he would like give it a little kiss and giggle. I forget what that was, but I saw it and I was like holy cow, if you could genetically engineer me a miniature giraffe, that would be my pet. It would sit right here on my desk with a little pillow and hang out, I would be [crosstalk]. So when I think of miniature horses, I think about that giraffe, and I’m like, that would be adorable.
Richie Rump: Yeah, but think about it, horses are still going to be pretty fast, even if they’re miniature, right, but a duck, that’s going to be whiling around pretty slow. I could probably out-maneuver a horse sized duck as opposed to these thousand tiny horses, almost like fast moving zombies on me. Yeah, I’m not about that.
Brent Ozar: That’s true; I buy that. And with that, ladies and gentlemen, we will end this edition of Office Hours, thanks, everybody.