Our first two concerns in this series were virtualization licensing and recoverability. It’s not enough to have answers for these questions today, though – we have to think about how long the server will be in place.
The server’s expiration date isn’t dictated by IT. If it was up to us, we’d never run an old, unsupported version of anything – but sometimes the business is quite content doing just that. So how do we plan for this from the start?
We need to get written answers from the business using a simple worksheet. In our Virtualization, Storage, and Hardware video class, students get an Excel spreadsheet to use when interviewing the business users about the project requirements. Here’s one of the tabs:
Start with the first question: how long does this SQL Server need to last? This can be measured in time (like 3 years) or in business growth (like until we quadruple the number of active users).
There are columns for several answers because database servers are rarely one-person projects – people have lots of different opinions, so everybody fills in their answers, and then we come to a group consensus in the last column.
The next several questions ask about how the business, application, and data will change during that time. We’re not looking for exact percentages, but just vague ideas. Are we going to double the number of end users per year? Are we going to start saving files in the database in the next version of our app?
This isn’t just about query performance – remember, we have to be able to do our backups, DBCCs, and restores fast enough to meet the business goals for RPO and RTO. After all, the business isn’t likely to say, “Since we’re growing fast, it’s okay if you lose more data.”
Who Really Answers These Questions?
Every time I teach this class in person, someone raises their hand and asks, “Do business people ever actually have answers for these questions?”
But it opens their eyes to your challenge as a database administrator. How are you supposed to build a perfect server to handle load that the business can’t define?
The less specific the business goals are, the more agile your server design needs to be.
That’s Where Virtualization Shines.
When you’re handed a wildly vague set of business requirements, it’s easy to build a virtual machine sized for today, and then add additional capacity to handle growth of data, users, and app complexity.
But here’s the kicker: you have to actually monitor the growth.
And just buying an off-the-shelf monitoring product ain’t gonna cut it.
Off-the-shelf monitoring products don’t tell you when your number of active users have doubled, when your incoming order rate has quadrupled, or when you’re keeping ten years of sales history instead of two. You have to work with the business to define a list of business metrics that you can monitor via database queries, and then add those metrics to your monitoring software.
For example, when I ran a sales system, I added monitoring queries for the number of orders per day, the number of active salespeople, and the number of orders we kept in the active database. I used my monitoring tool to create line graphs of those numbers every day, and then I sent that report to the stakeholders every month. I could point to the graph and say, “Sales volume has grown by forty percent, and we haven’t archived any data. Without additional investments in hardware, tuning, or staff training, you can expect that performance will degrade accordingly.”
Lines like that open wallets.
But You Gotta Understand Databases, Virtualization, and Business.
This is where I got so frustrated with guidance that says things like, “Set up one data file per virtual core.” If you’ve got a database that’s doubled in size over the past year, and you need to add four additional vCPUs, that guidance means you would have to:
- Add four LUNs on every SQL Server (primary, disaster recovery, reporting, QA, development – because you’re going to refresh those databases from production)
- Add four more identically sized data files (which instantly gives you hot spots on those empty files)
- Rebuild your clustered indexes to even out the load on the old and new data files
- Watch your transaction logs and backup sizes during this time because you’re going to have a huge change rate
But the one-file-per-core never includes guidance like that. They just expect you to “add files” and call it a day, and they ignore the technical and business side effects. Does the business really want to take a huge slowdown impact while all this happens? Especially if they’re in the midst of a growth spurt? Of course not.
Instead, you have to look at the bottleneck that the SQL Server is really facing, and identify the best way to fix that bottleneck with the least impact to the business.
That brings us to the last set of questions on that page of the planning spreadsheet – how will we handle growth when it happens?
If we suddenly acquire our biggest competitor and need to double the data and user load, or if we run a Super Bowl ad, how will we proactively keep the database fast? Are we going to be able to tune code or hire staff – or are we only able to throw hardware at the problem?
Another way to visualize this for management is my Manager’s Guide to Tuning Code – it shows management what your options are, and lets them choose your avenues.
Don’t get me wrong – I love throwing hardware at a problem when it’s the easiest, cheapest, fastest way to solve the pain. But remember that while virtualization makes it easier to throw hardware at a problem quickly, that may not be the easiest/cheapest/fastest solution. And even if you do it, you need to understand the real side effects of tricks like “one data file per volume per core” – because it’s nowhere near easy/cheap/fast in the real world.