This week, Brent, Richie, Erik, and Tara discuss partitioning tables, native backups vs 3rd party agent backups, page life expectancy, query tuning, deadlocks, drawbacks of RCSI, triggers, replication, Always On Availability Groups, and forest fires.
You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.
Enjoy the Podcast?
Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes
Office Hours 2017-7-12
At what size do tables benefit from partitioning?
Brent Ozar: Richard says, “Is there a rule of thumb for a number of rows that would make you want to partition a table? I don’t need archiving, I don’t need sliding window loads; I just have a 230 million row table with about 45GB on disk. Is there some rule of thumb when I should consider partitioning?”
Erik Darling: Not if you don’t need those things.
Brent Ozar: What would make you think you need it?
Erik Darling: Needing sliding windows and archiving. Because it’s not going to help for performance, it’s not – it’s going to make things more complicated for you [crosstalk]
Richie Rump: Partition elimination?
Erik Darling: Yeah, if you get it, if you’re real lucky.
Richie Rump: Yeah, well you would have to use that partition key in order to get that.
Erik Darling: Yeah, so if your queries don’t match your partitioning key then it’s no good.
Richie Rump: Yeah you’re screwed, sorry.
Brent Ozar: And it’s becoming worse instead of better. SQL Server has to reassemble those results again; it’s tough. In terms of, is that a large table or not either, I don’t think it’s large for SQL Server anymore. I mean, the rule of thumb for VLDBs was always 1TB or one billion rows in a single table. So in the hundreds of millions, I mean, it feels intimidating, but remember, you’re always going to be working with the biggest servers you ever worked with. Your career’s always going to progress forwards until you have that one accident where you drop tables, drop the database, then you temporarily get fired but then you get another job…
Tara Kizer: I was okay when I did that…
Brent Ozar: Did you blame it on anybody else or?
Tara Kizer: No, no, no I always take – I always raise my hand.
Brent Ozar: What was the first worst memory when you dropped a table or dropped a database?
Tara Kizer: Oh goodness, the first? It would have been about 2002 or
2003 when I’d just started at Qualcomm, that’s the company that had Las Vegas DR site. And we did a planned failover to Las Vegas, they did it one or two times per year and we were production out there for a couple of weeks at least. And this was back in SQL Server 2000 days, so we were using log shipping and after we were done with the failover our process was to now drop log shipping and set it back up in the other direction. So what I would do is I would drop the database at the other site because it’s in a recovery state. So I would just drop and start the research. Now, you don’t have to do it that way, but that’s what I did. And I accidentally dropped the production, the one we had failed over to, because it was now production. So, you know, we were in a maintenance window, so it wasn’t a huge deal, you know. And back then databases weren’t big anyway, but it didn’t take me long to do a restore. But that system, that database had been online after the failover while we were working on all these other issues with other databases, other applications.
Erik Darling: It figures, that’s like the one time that a dropped database works on the first try. [crosstalk] no you can’t do it… What does it matter, SQL’s just like yeah, go ahead.
Brent Ozar: Yeah, it’s the same thing with SSMS letting you click twice slowly on a database name and just rename a database. Some things should be a little harder than that.
Erik Darling: Bad news, everyone has to use tilde now. That’s the first thing I hit instead of hitting escape.
Should I put dev/test/prod all on one VM?
Brent Ozar: Let’s see here. J.H. says, “What are your thoughts on having one SQL Server VM serving as all three dev test in prod, and then just use three databases, one named dev, one named test, one named with prod and just use different logins? What do you think about that?”
Tara Kizer: Oh my god, no. No, no, no, no, no. What are you doing? So this is going to be on the same SQL Server instance – you get a QA person who’s got elevated access in the test environment, and suddenly they brand this massive query and, you know, hogged up all the CPU and memory. I mean, wow, no. [crosstalk]
Erik Darling: Yeah, the size of that box would have to be so gigantic for me to be comfortable with all three environments sharing that it would probably just outdo any gains you got from it.
Brent Ozar: And your licensing would suck then at that point, because you were doing Standard or Enterprise on there, as opposed to Development Edition, which is free these days.
Tara Kizer: J.H. says it’s a VM, so why are they sharing? Don’t you have other servers?
Brent Ozar: He says, as a follow-up, “It’s very small DBs for one specific team”.
Tara Kizer: I don’t care. Prod is prod. Everything else can go on the same box, I don’t care. Put dev, test, stage – well maybe not a load test environment, maybe not stage, the ones that should mimic production. But all those other smaller databases that are not prod, they’re on one same server, I don’t care about that.
Erik Darling: Someone monkeys around with an agent job or something else in there, that’s prod affecting too.
Brent Ozar: Richie, I assume that – Richie does all the development here and manages all the databases – even with us as all database administrators, we use separate dev and test and prod environments, I assume?
Richie Rump: Yes, we sure do, absolutely. [crosstalk] SQL side, we do that.
Should I use Ola’s scripts or Unitrends?
Brent Ozar: Let’s see, next up Doug says – not Doug Lane, but another Doug; although hi Doug Lane, wherever you are. Doug says, “Currently using Ola’s scripts to manage backups for about 20 instances of SQL Server. Our operations manager is pushing something else, Unitrends Enterprise Backup. What are your thoughts on native backups versus somebody else’s third party agent stuff?”
Tara Kizer: I don’t like it. I especially don’t like the fact that – Unitrends Enterprise Backup, I’ve never even heard of that. I mean, a lot of clients aren’t using native backups, maybe they’re using other things… but we’ve at least heard of the technology that they’re using. This is – I’ve never heard of this. I would say no.
Brent Ozar: I think it’s CA. I think it’s Computer Associates, I’m not100% positive. Like it’s one of those mainframe-y companies…
Tara Kizer: Yeah, but if no one else is using it in the SQL Server world, or hardly anyone is and you’re not seeing any information out there about it, I would not go that way.
Brent Ozar: And what are some of the reasons why you don’t like third party backup things that are outside of the DBA instead of native backups?
Tara Kizer: So at one of my jobs, every few years they would ask, can we move to, say, NetBackup for the backups. Don’t do native backups – because before, we were doing native backups and then sweep those to take with NetBackup. But they wanted to get rid of the native backups for SQL Server because the Oracle side was using NetBackup. So we did this whole proof of concept and stuff, and it worked fine, but as a DBA, I don’t want to have to go to some other tool. I’m not even the one managing that software over there, you know.
In the companies I’m at, you know, we’ve got sysadmin teams, we have SQL Server team, you know, everyone’s different. So, I have to go to a sysadmin to tell them what to select for the restore and point in time recovery and all these things. So it just complicates things for a DBA when it comes time to do a restore. And when it’s time to do a restore, that’s a very critical task.
Brent Ozar: yeah, log shipping makes those things less useful to me because I want access to those log backups. If I need to reinitialize a database mirror or Always On availability groups, those things don’t help me very much. So huge fan of – when the folks say, I want to use Unitrends, NetBackups, whatever, I’m like sure, go backup the file share where I write my backups too; you can totally use the agent there.
Why is PLE different across 2 NUMA nodes?
Brent Ozar: Gordon says, “I’m seeing a large difference between page life expectancy on two nodes of a two CPU environment.” He’s got numa, so he sees two different tracks for page life expectancy. “Shouldn’t they be the same?”
Tara Kizer: Gosh, I remember there’s an article about this topic, that if they are not the same then there’s an issue.
Kehayias Paul Randal has one about it; we’ll have to put it in the show notes. But essentially, you can have different workloads happening across the two different nodes, I just wouldn’t focus on PLE and I’d jump out and look at wait stats instead. Because PLE can be in the toilet and your server may not be waiting on memory, you know, may not be waiting on storage, in other words reading data pages from a data file. Your PLE can also be fantastic and yet the server performance can be in the toilet; like you have locking problems or blocking issues. So I just wouldn’t use PLE that much anymore.
Why is a query faster on one AG replica?
Brent Ozar: Let’s see, Matt says something that’s going to be near and dear to Tara. “We are going to move from a single SQL Server to an availability group. The new Ag servers have more memory and CPU than the old server, but when I’m testing the same query on both environments, the AG servers are eight seconds slower. Where would I look to see why a query is faster on one server than another?”
Tara Kizer: Who wants to do it?
Brent Ozar: Erik, you haven’t answered a question in a while.
Erik Darling: There’s lots of stuff to look at. The first place I would look is the execution plan.
Brent Ozar: And what do you look for – like how do you compare to execution plans?
Erik Darling: Me? Oh god, I’m a horrible klutz with – I always just open stuff up in plan explorer. So you could look there – let’s see… I really thought the AG thing was going straight to Tara [crosstalk]
Tara Kizer: I mean, this isn’t an issue with the AG though.
Erik Darling: Alright, alright. So I would look at settings on there, see if maybe MAXDOP or cost threshold is different, but that might show up in the execution plan. Let’s see what else, what else would be good? I don’t know…
Tara Kizer: [crosstalk] waiting on, maybe it’s waiting on something different on this server than on the other server.
Erik Darling: My first instinct is just to check the execution plan…
Tara Kizer: Exactly, this sounds like just a bad execution plan parameter sniffing issue, but you know, parallelism settings, maybe check out the wait stats. Not maybe, do check out the wait stats, but maybe the wait stats is going to show HADR sync commit has a very big waiting time as compared to system up time. And if that’s the case, maybe the two sync commits to the other server. If you’re using a synchronous commit replica from the AG, maybe it’s adding a lot of overhead because you do not have blazing fast storage and blazing fast networking; those are critical for synchronous commit replicas. So wait stats and then what is the query waiting on. Look at sp_whoisactive on top of the execution plan stuff.
Erik Darling: Or sp_BlitzWho, sp_BlitzWho is okay too.
Tara Kizer: I’m just so used to whoisactive, you know. Something so rock solid for so many years and this other tool comes along. You know, I don’t need to switch to another tool. I like BlitzFirst, don’t get me wrong, I just, you know, for current activity I’m going to whoisactive. Sorry.
Erik Darling: No loyalty
Tara Kizer: the loyalty is to the product I’ve been using all these years.
Erik Darling: How much does Adam pay you?
Brent Ozar: Boos… The other thing is, of course – don’t forget the obvious thing is different data across the two different servers too if you haven’t restored that AG server in a while. Also, if you wanted to share the plan with other people, I don’t know how sensitive the query is but you can also use Paste the Plan. If you go to PastethePlan.com, you can paste in your execution plan and then you can give both of those links in a Stack Overflow question, you can post it on SQLTeam, SQL Server Central and say hey, help me figure out what the differences are between these execution plans so you can see which one’s better or worse.
Can I help my developers reduce deadlocks?
Brent Ozar: J.H. asks, “I had a developer recently experience deadlocks, and he asked if anything can be done on my end as a database administrator. I played pin the blame on the developer, but is there anything else I can do in order to make these deadlocks disappear?”
Tara Kizer: Well Brent, last week you mentioned how deadlocks was like your thing that you needed to work on, the thing that you’re weak on, and then you said, you know, as soon as you get to that deadlock graph, you mentioned that’s – you know what to do. I can’t read deadlock graphs.
Brent Ozar: I love – I mean, there’s two parts to it. One is troubleshooting why the deadlock is happening and then the other part is doing something about it. And the place that I would start is, often tables have no indexes, they have a clustered index and that’s it. Imagine the phone book, the white pages of the phone book that grandpa kept on top of the fridge. Well if I told you, go through the white pages and update everyone whose first name is Brent, you’d end up doing lock across that entire phone book, finding all the Brents and doing the updates. Whereas if you had the right non-clustered index on first name, you’d be able to seek directly to the Brents, you’d be able to know how many rows are going to come back. Some of this is a database administrator’s job, assuming that database administrators are the one in the shop fixing the indexes. You could also play around with read committed snapshot isolation.
Now, this isn’t bulletproof. If two people want to lock the same row, you’re still screwed, but if you go to BrentOzar.com/go/RCSI, as in remote controlled Subaru Impreza. If you go to BrenOzar.com/go/RCSI, Kendra Little has a great write up on everything you need to think about, how RCSI makes your database queries faster and what you need to tell your developers in order to implement RCSI. Start with indexes; RCSI is probably the next level up.
Richie Rump: Yeah that’s funny, as a developer, I know very little about deadlocks. I know that when it happens I go straight to the indexes, right, and it typically fixes my problem. And in fact, anything I do new that’s greenfield, RCSI is the first thing that’s turned on. Any new database come on, RCSI – and if there’s a problem that we need to fix and turn off or whatever, which I’ve never had happen to me, then we can handle it there. But it’s usually a good default to turn on RCSI.
Erik Darling: I’d say if you’re in a position where changing the indexes or changing a setting like RCSI isn’t possible, I would have the developers just build in retrial logic, so that if they hit a deadlock they catch that error and retry their transaction. Like wait for a second, wait for half a second and then retry.
Richie Rump: Still not fixing the deadlock…
Erik Darling: No, but at least you’re trying it again so you’re not missing the transaction.
Brent Ozar: Which may be fine. I mean, in some shops it’s totally okay to just retry [inaudible] behind the scenes.
Brent Ozar: Robert says, “Are there any drawbacks of RCSI?” The big classic one is that some yo-yo begins a transaction on Friday, locks his workstation and then goes home for the weekend, and then tempdb explodes in terms of size. So long running transactions is the big ugly one. You can learn more about it though at BrentOzar.com/go/RCSI. Got huge write ups in there.
Have you done cross-database queries in Azure?
Brent Ozar: Wilfred says, “Do you folks have any experience doing cross-database queries in Azure?” I assume he means Azure SQLDB. The answer on that is no because it sucks. You have to predefine your tables on both sides, or on whatever side you’re going to do a linked server query to. Like I have to define what the remote table’s going to look like as an external data source, and that’s just a brittle slow-down for most of my developers. Developers just want to go database name.dbo.table name, and having to predefine the table makes things a little wonky.
Can I redirect queries between AG replicas?
Brent Ozar: Kelly asks, “With a read only secondary in an availability group, is there a way to redirect queries back to the primary when the secondary stops responding?’ You know, her “sandbox environment, some of the mount points for the secondary went down” – not laughing at you, I’m laughing with you – “and my read only connections for my reports went along with it.”
Tara Kizer: it’s the read only routing list, so you have that set up to go back to itself if the other replica is down. So it’s read only routing list. And make sure that the read only routing URL is correct for it.
Erik Darling: Is that available in all availability group versions, or is that…
Tara Kizer: I mean, I don’t know about basic availability groups – well there’s no readable secondary, so no. yes, it’s available since 2012, definitely. There’s no GUI for it, so it’s easy to miss. I missed that when I first set up availability groups four years ago, and read only routing was not working.
Brent Ozar: The part that sucks is, you’ve got to make sure your developers understand application timeouts too. The connection timeouts may need to be set longer, because if they have short timeouts, like 30 seconds, it may take 30 seconds for the connection driver to try the first server that’s down and then go retry the second. It also won’t fix queries that are in flight. Once the query starts, if all of a sudden the mount points disappear under the secondary, that query is toast. You just have to build in some kind of retrial logic, but leave it to the consultants to say, build retrial logic into all your application queries.
Richie Rump: Oh it’s so easy, yeah, let’s go ahead and do that… Just do it, developer.
Erik Darling: I have supported entire applications that were built on retrial logic, so it can’t be that hard.
Richie Rump: It takes time, money and effort, that’s all.
Brent Ozar: And retrofitting it into an existing application is a nightmare; it’s just a nightmare.
Richie Rump: yeah, and typically when you say use retrial logic, that is our scenario, right? It’s not something we’re building new, because the new stuff is probably going to be in the cloud, which already has automatic retries. Thanks, guys.
Any free tools to develop triggers?
Brent Ozar: J.H. says, “Do y’all know of any free tools to help develop and troubleshoot triggers?’
Tara Kizer: No.
Richie Rump: The delete trigger command, I don’t know…
Brent Ozar: there was – I just saw this hit recently, and I’m going to go look to make sure it’s not a joke… Hold on a second [crosstalk] It had the feel to it. Yeah, so if you go to Amazon, Thomas LaRock put together a book recently, an ebook called The Trouble With Database Triggers. I have not read it, as I had to go look and make sure it wasn’t a joke, but it’s – oh, it’s 15 pages. 15 pages, okay, well it might be a starting point then. [crosstalk] 15 pages, that sounds real, okay, that’s legit. But I remember writing my first stored procedure, and I remember being a developer and going to the book store and there was actually a book on stored procedures, and I was like, this is awesome, this is fantastic. It’s actually harder than it looks, so I thought maybe there was going to be one on triggers, but it’s 15 pages.
Erik Darling: Aaron Bertrand though does have a pretty good talk, I believe it’s SQLBits, about triggers that I would recommend. If you’re getting into the business of triggers, give that a watch. It’s about – it’s like an hour or so; not the rest of your life.
Brent Ozar: Somebody said a couple of weeks ago, they were like, you know it’s amazing when you guys go and just open the URL on the webcast so we can go see what you’re talking about. [crosstalk]
Erik Darling: No I mean don’t use the site search on SQLBits.
Brent Ozar: Yeah, no it’s bad. So SQLBits is a conference where they record all their sessions, and it’s totally available for free. If you go all the way down to the bottom, they have previous sessions and previous speakers. So go to previous speakers and then look for Aaron… First on the list, look at that. Mother optimized, mother, parents, father optimize for the SQLBits list…
Erik Darling: they should have just put his name in quotes…
Brent Ozar: Ooh, five ways to write more effective triggers. And then you can download the slide deck or watch the video right there inside the browser.
Erik Darling: I’d watch the video, Aaron’s a handsome feller.
Brent Ozar: Oh does it actually have the camera? It does, wow.
Erik Darling: I think he’s in a kilt too, so it’s like double your pleasure.
Brent Ozar: I would play it now but then we would lose the entire audience and that would be the end of it. Forget watching these people, we’ll just go watch Aaron Bertrand in a kilt.
Do I need to stop replication to patch?
Brent Ozar: Let’s see here, Philip says, “Do I need to stop replication to do a cumulative update on either server? My assumption is that it will pick up and restart and reboot without recreating replication.”
Tara Kizer: The answer is definitely no. you’re free to go ahead and install it and it will pick up again. You should not receive any errors from replication. Usually, the errors for replication have to do with somebody manually modifying data on the subscriber and now, say, a delete isn’t working, so it’s causing replication to be in a broken state. So yes, you’re good to go there without doing anything to replication.
Brent Ozar: Am I the only one who heard Tara say you should never get any errors from replication? [crosstalk]
Tara Kizer: Reboot.
How do maintenance tasks work in Availability Groups?
Brent Ozar: Richard says, “In an availability group, do the maintenance tasks move across with availability groups? Meaning like updating stats and rebuilding indexes.”
Erik Darling: They sure do…
Tara Kizer: Absolutely not. The only thing that moves over when you have a failover of an availability group is those databases. That group has failed over, only those user databases inside the group has failed over. Nothing else, no system database information, so you’re missing jobs, missing linked servers, you’re missing server level prints. You’re missing a bunch of stuff and so it’s your job, as a DBA who has set up availability groups, to understand these and make sure that other server has a copy of those objects. And you could copy those jobs over and have all the jobs check if they’re a primary replica, and if they are then run the jobs step, the rest of the job step. If they’re not, then do no other work.
So your job is to copy those jobs over and make them AG aware by having an IF statement inside the jobs step to check, am I a primary replica. The alternative solution, because people don’t like having jobs, you know – if you’ve modify a job on one server; you’re going to have to remember to modify the job on the other server. So some companies, they’ve chosen a third server to be their job scheduler where their job steps are pointing to the availability group listener name. So, it is already AG aware, it doesn’t matter which node, primary…
Erik Darling: See I interpreted that question a little bit differently. I thought he meant if the effects of the jobs went across from one to the other. That’s why I said sure do, because I was – you’ve been in the spot where you’ve had to turn off index rebuilds altogether with an AG [crosstalk]…
Tara Kizer: My client this week, I’m telling them to do that too. [crosstalk] I mean, they’re having tons of issues with rebuilding indexes and they have an availability group.
What’s the best way to put 6,000 databases in an AG?
Erik Darling: Speaking of tons of issues with an availability group, I see a question from a feller named Jeff.
Brent Ozar: I was holding Jeff’s one until the last, but alright, let’s tackle it. Jeff says – and it sounds like Erik is excited to do this as well. Jeff says, “What is the best way to do Always On availability groups with an instance with 6000-plus databases?”
Erik Darling: With log shipping.
Tara Kizer: Do not go there. Do not use availability groups for this many databases.
Brent Ozar: Talk about the problems with it.
Erik Darling: Which one, I’ll start with thread pool – oh I didn’t know who you were pointing at…
Brent Ozar: No, you were excited for this question…
Erik Darling: You had a non-deterministic point, I couldn’t help it, I didn’t know where that was going. So the main problem that you’re going to run into there is keeping all those databases synced up, because the processor that you would need, to support 6000 threads, I don’t think has been invented yet. And you’re going to run into all sorts of issues with thread pool and keeping all those databases in sync is going to require worker threads to work on that, and then you’re going to have nothing for user queries to run. So, that’s going to be tremendously painful, you’re looking at breaking that out into probably at least six to eight servers before you even have a manageable amount of thread usage.
Brent Ozar: The worker thread count is tied into the number of cores in your processor, but you could also override it. There’s a setting called max worker threads, why wouldn’t you just change that?
Erik Darling: Because when worker threads are constantly context switching like that, you can run into performance issues waiting for a worker thread to pick back up on a query or pick back up on synchronizing a database. So you could lose synchronization process or it could like time out, or you could lose – queries would degrade performance OIs waiting to get back on these treads that are spread all around.
Brent Ozar: To learn more about that, there’s a great post by Warwick Rudd on – it used to be Simple Talk – Always On availability groups, what not to do when adding databases. And he goes in detail into the worker thread exhaustion issue. So you said log shipping instead, so what would drive you towards log shipping and why?
Erik Darling: the fact that we wrote a couple of stored procedures that do log shipping pretty well at scale like that. So, I mean, just for log shipping in general, it would just be that it takes a lot less from CPU usage and from a performance point of view to back up logs than it does to constantly keep data flowing from one database to another. So you could have – even if you just used standard log shipping with log backups, you know, sort of paste, not like every job starting every five minutes, because that might cause you some issues as well, but if you, you know, spackle the jobs so that they kind of start at different times, you could have pretty good luck getting log shipping to work in a reasonable manner.
I mean, 6000 databases is even pushing it with log shipping for me, but AGs is just off the table immediately.
Tara Kizer: I’d use log shipping for disaster recovery and then I’d use a failover cluster instance for the HA portion.
Erik Darling: the old FCI and log shipping…
Tara Kizer: Yeah, but all they’re using the AG is for HA then I would not use log shipping. I’d use log shipping for DR.
Erik Darling: Yeah.
Brent Ozar: We joke around all the time that a lot of times clients will come to us and want advice on high availability in disaster recover, and sometimes we feel bad saying failover clustering and log shipping, because it’s such a good default answer. It just works for so many things, here you go, failover clustering and log shipping, here you go, failover clustering and log shipping… But it works really well. At 6000 databases, there is something you still need to be aware of; that’s startup time. Startup time can take a long time to make all of the databases available. There’s another great post, and I can’t remember who this one’s by and it will take me forever to find it, but it’s the order in which databases start up. It’s based on the database ID. You can easily run into a scenario where the first hundreds of databases are online but the last hundreds are not.
When I have the page up for sp_allnightlog, just because this is what Erik and I have been working on for the last couple of weeks for a client gig, and I have been continuously singing Lionel Richie for like the last month. Just every time I pop open this code…
Richie Rump: So do you owe him royalty rights for naming this all night log?
Brent Ozar: I can only imagine, especially every time I’ve sung it. He’d probably pay me to stop singing it at this point.
Richie Rump: I just still see Chris Tucker and Fifth Element, yeah, I go right there.
Erik Darling: Yeah, that was me as well [crosstalk].
Should I directly query a 3rd party database, or use their API?
Brent Ozar: Samantha says, “Our developers want data from a vendor-supplied app.” So it’s a third party app that they’re not supposed to touch and they get to it through API calls. “In the spirit of collaboration, I let them know we could do availability groups with a read only replica. This way, they won’t have to wait on the vendor to write new APIs, we can just query their database directly. However, they want to write indexes for their own lookups. Should I do transactional replication instead of an availability group? Should I even do this at all?”
Tara Kizer: I like her thought process. I mean, moving those queries over to another server so you’re not impacting your OLTP database, where all the important stuff is occurring. But if they need very specific indexes for their queries that you don’t want on that production database, you can’t use availability groups. As you know, it’s a copy of your database. But then you’re stuck with transactional replication, and I don’t like supporting transactional replication. You support it for so long and the troubleshooting is ridiculous and it takes a long time to set up on a sizable database any time you have to reinitialize it, for whatever reason.
Brent Ozar: Richie, go ahead because I think I know where you’re going and, Richie, I’m with you.
Richie Rump: Okay, so you have this API that sits in front of a database which the vendor manages. Then you’re going to want to get behind that database and write your own queries. So what happens when the new version comes out, they modify the database and now your application goes splat, because you’re reading stuff that’s not there or stuff that’s been added and now your data has changed and you haven’t realized it because they’ve changed the key in the background or something crazy like that.
If you’re willing to jump into that, just know that you’re going to have some maintenance stuff that’s going to happen with a new release, and you’re not going to know what that is because the vendors kept it hidden because it’s in a database and they have to give you the API which they publish. And they’re probably not going to give you a schema, which they publish, which they probably won’t publish to you. So it’s tough. If I had to do this, I probably would put my own API in front of it and manage it that way as opposed to the developers having access directly to the database, and now I have all these queries going around touching the database, as opposed to one central point where all you do is change the API and it’s done.
Erik Darling: Cache it in Mungo, as they say.
Richie Rump: Or elastic search…
Brent Ozar: Instant technical debt.
Fast answers to several questions
Brent Ozar: Then we’ve got like six seconds, I’m going to rip through the answers really fast just so that we can nail them all before we bail out. Let’s see here, Jeff says, “Don’t only active AG replications take worker threads?” That’s true, but say that a secondary goes down or restarts and you’ve been doing index rebuilds over on the primary; immediately all of that data is going to try flow through to the secondary, and immediately the primary falls over.
Brent Ozar: Michael says, “Any gotchas with setting up an on-prem linked server to an Azure SQL database?” Yes, security. Gotcha on that one, that one’s a little tricky.
Brent Ozar: Oh there was another one that was epic. “What about moving those 6000 databases to Azure SQLDB?” Cha-ching, hello. Just remember that you can buy elastic pools of databases and then provision capacity by the pools, but there’s no pool for 6000 databases. So now you’re going to have to start micromanaging across all of those. You may save money on staffing, but you may end up spending a whole lot more money on capacity planning and performance management as well.
Erik darling: That goes the same for RDS as well, just because RDS has a 30 database per server limit. So what’s 6000 divided by 30?
Richie Rump: A lot.
Erik Darling: I’m not a math guy, so I don’t have an answer.
Brent Ozar: 200?
Erik Darling: 2000 maybe? I don’t know.
Brent Ozar: Alright, well thanks everybody for hanging out with us and we will see you guys next week on Office Hours. Adios.
Fun fact about Unitrends. When we bought them at my old job, we had a couple thousand DBs on an instance. They were all, it’ll backup fine with VSS. Reality turned out to be multiple tickets with MS and them rewriting their entire VSS process thanks to it making too many calls and our backups running for days to get one set. We quickly moved back to native. This only required another box for to use as a fileshare target.