Your databases are hosted in the cloud – either in VMs or in a database-as-a-service – and you’re having a performance emergency that’s lasted for more than a day. Queries are slow, customers are getting frustrated, and you’re just not able to get a fix in quickly.
Just ask management if they want to throw hardware at it.
Here’s what your email should look like:
Right now, everyone’s unhappy with performance, and I’m in the middle of researching the issue. I’m not going to have a definitive answer – let alone a solution – for at least a few days.
In the meantime, do you want to spend $1,500 per week to temporarily increase performance?
If so, I can change our VM from an r5.4xlarge (16 cores, 128GB RAM) to an r5.24xlarge (96 cores, 768GB RAM.) Our VM costs will go from $293/week to $1,757/week. Those costs don’t include licensing because I’m not familiar with how we’re licensing this VM – that would be a separate discussion for Amy in Engineering, who manages the licensing.
The email is short and to the point because:
It ignores blame – I don’t care whose fault it is, whether it’s my own fault for being unable to fix it faster, or a growing customer base, or bad code, or maybe the database server should have been larger in the first place.
It doesn’t include an end date – maybe I’m going to be able to solve it in 3 days, but maybe it’ll take a month. This open-ended temporary solution buys us breathing room to do a really good job on the fix rather than duct taping something crappy.
It doesn’t quibble over VM sizes – because during an unforeseen, unsolved performance emergency, you simply don’t know which exact VM size is the best bang for the buck. Just go with the largest size possible, end of story. If someone wants to quibble over sizes, then they’re welcome to do the research to figure out what the right size is. You are not the person to do that research – you need to keep your head down solving the problem. If you don’t even know the root cause and the software fix yet, then you certainly can’t make an exact determination on which VM size will give you the right fix.
It translates technical debt into real dollars – because sometimes the business makes the conscious decision to ship less-than-optimal code in order to bring in revenue. Sometimes, they also want the ability to continue shipping that less-than-optimal code, and they’re willing to spend money to keep bringing in that revenue. That might frustrate the anal-retentive perfectionist in me that wants every line of code to be flawless, but the reality is that sometimes we just need to keep shipping and taking customer money.
It gives the business a choice – and choices help take heat off you. When a manager is screaming at me that we have to fix things faster, I love being able to say, “It sounds like this is really important to you, and you want to move quickly. Here’s a way you can make that happen right away. Oh, what’s that you say? You’re not really that interested? I see. Well, in that case, I’m not going to work 9AM-9PM 6 days per week to fix this, because I have a family at home.”
When your company chose to migrate to the cloud, I bet one of the reasons was to gain flexibility. Let that be your friend.