Pop quiz: should you be worried if your SQL Server’s page life expectancy is averaging 214?
There’s only one correct answer: it depends. Successful performance tuning boils down to a simple cycle:
- Measure the application’s performance
- Find the current bottleneck
- Improve that bottleneck so that it’s not the bottleneck anymore
- Measure to find out how much your application performance improved
- Ask the application owner if it’s good enough now. If so, move on to the next application. If not, go back to step 2.
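The cycle above can be sketched as a loop. Everything here is a hypothetical stand-in – the hooks you pass in represent real monitoring tools, real fixes, and a real conversation with the application owner:

```python
def tune(measure, find_bottleneck, improve, good_enough, max_rounds=10):
    """Sketch of the tuning cycle from the list above. Every argument is a
    caller-supplied hook: real monitoring, real fixes, and a real
    conversation with the application owner go behind each one."""
    for _ in range(max_rounds):
        before = measure()              # step 1: measure performance
        bottleneck = find_bottleneck()  # step 2: find the current bottleneck
        improve(bottleneck)             # step 3: make it not the bottleneck
        after = measure()               # step 4: measure the improvement
        if good_enough(before, after):  # step 5: ask if it's good enough
            return after                # it is – move on to the next app
    return after                        # ran out of tuning rounds
```

The point of the sketch is the shape of the loop: you never skip the measurement steps, and the exit condition belongs to the application owner, not to you.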
And every one of those steps is equally important.
If You Don’t Find the Right Bottleneck
Performance tuning isn’t about zooming in and focusing on a single number in incredible detail; rather, it’s about stepping away and getting the full picture. Time and again, I get emails asking whether a single metric is OK, but upon questioning, the DBA has leapt to a conclusion without surveying the environment as a whole. If you spend your tuning time closely examining a single metric, trying to figure out how to improve that one metric, you might not improve the performance of your application.
Sure, your page life expectancy might be pretty bad – but is that the one thing keeping your application from performing faster?
Take a step back and gather a complete set of Perfmon metrics. Look at CPU, memory, disk, and network performance as a whole, and find the counter that’s in the absolute worst shape. I work through the metrics in a specific order to figure out which one looks like the most likely bottleneck.
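One way to operationalize “find the thing in the worst shape” is to score each counter against a healthy threshold and pick the worst offender. The counter names and threshold values below are illustrative examples I made up for the sketch, not official targets, and this simple ratio only works for counters where higher means worse:

```python
def worst_bottleneck(samples, thresholds):
    """Score each higher-is-worse Perfmon counter by how many times over
    its healthy threshold it is, and return the worst offender. Counter
    names and threshold values here are illustrative, not official."""
    scores = {
        counter: value / thresholds[counter]
        for counter, value in samples.items()
        if counter in thresholds
    }
    return max(scores, key=scores.get)

# Hypothetical sample: CPU is 3x its comfort level, disk latency is 2x,
# and page faults are well under threshold - CPU is the likely bottleneck.
samples = {"% Processor Time": 90, "Avg. Disk sec/Read (ms)": 40, "Page Faults/sec": 500}
thresholds = {"% Processor Time": 30, "Avg. Disk sec/Read (ms)": 20, "Page Faults/sec": 1000}
```

Comparing ratios rather than raw values is what lets you compare a CPU percentage against a disk latency in the first place.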
If You Don’t Focus on Improving the Bottleneck
I was recently working with a client frustrated with their application performance. I found two issues:
- CPU-intensive user-defined functions were being called thousands of times per query
- The storage subsystem was nowhere near as fast in practice as the vendor had claimed
The application’s bottleneck was the CPU-intensive UDFs. The server was frequently pegged at 100% CPU, and queries just couldn’t run any faster until they were rewritten to rip out the UDFs. I put together a recommended plan of action to take those UDFs out, which would make the application an order of magnitude faster. I noted that they should probably start working on the storage performance in a second track, because the instant the UDFs were removed, storage was going to become a problem. With the CPU-burning UDFs out of the way, the server would be able to churn through more records faster, but the storage subsystem wouldn’t be able to deliver records fast enough to satisfy the users.
On our next status update call, they said they’d reworked the storage subsystem. SQLIO reported dramatically faster storage throughput, but they were only seeing a minor improvement in application performance. I had to break the bad news to them that they’d focused on the wrong problem first. After we revisited my report together, they pursued the UDFs with renewed vigor, and suddenly the application was blazing fast. Thankfully I’d documented my findings in writing. If I’d been an internal employee, I might have communicated them verbally instead, and I might have lost the ensuing battle to fix the UDFs because the manager would have thought my advice was bogus.
If You Don’t Measure Your Improvements
All DBAs are consultants.
Some of us think we’re full-time company employees, but in reality, we’re delivering a service to customers. Whether those customers are developers, project managers, end users, or other DBAs, they’re looking at you just as if you were an outsider. You’re expected to stride in, identify the problem, mitigate it, and show that your work delivered a return on investment. The investment, in case you’re not following, is your paycheck.
Don’t believe me? Poke around in an application and then throw up your hands, saying you can’t find a way to make it go faster. The next phone call the project manager makes will be to an external consultant, and the project manager probably won’t call you first next time. (Sometimes, that’s not a bad thing.)
If you put in a lot of hard work to make an application go faster, but you don’t measure the before-and-after effects of your work, someone else is going to take credit. The developers will say the improvements were due to a new version of their code, because they’re working on the code at the same time you’re working on the database. The sysadmins will say they defragged the muffler bearings. The SAN guy will say he tweaked the flux capacitor. The project manager will say he made everybody work long hours and that did the trick. The only way, the ONLY way the DBA can ever take credit is to take clear before-and-after measurements for proof. Run the same code base before & after your tweaks, and measure application performance. Follow up with a written report, even if it’s a one-paragraph email, summing up your changes and the performance improvements, and copy your manager on it.
If You Don’t Ask the App Owner If It’s Good Enough
Never confuse what YOU think about a metric with what the BUSINESS thinks about a metric. Your CEO doesn’t care about page life expectancy. (Your CEO probably doesn’t even care about the DBA’s life expectancy.) Before you spend time or money improving an application, you have to find out whether it’s the most important thing to your business right now.
Say hello to the most important metric you will ever calculate: opportunity cost.
Opportunity cost is the cost of doing something as compared to the cost of doing something else. If you spend eight hours today improving the page life expectancy of a particular server, is that worth more to the business than anything else you could be doing in those eight hours? Could you spend eight hours doing something more valuable?
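The arithmetic is trivial, which is exactly why there’s no excuse for skipping it. The dollar figures below are made up purely to illustrate the comparison:

```python
def opportunity_cost(chosen_value, best_alternative_value):
    """What you give up by picking one task over the best alternative.
    Positive means the alternative was worth more than what you chose."""
    return best_alternative_value - chosen_value

# Hypothetical numbers: eight hours tuning page life expectancy is worth
# $2,000 to the business, while eight hours on another project is worth
# $10,000. Positive cost: the time would be better spent elsewhere.
cost = opportunity_cost(chosen_value=2_000, best_alternative_value=10_000)
```

If the result is positive, the eight hours belong to the other task; if it’s negative, you’re already working on the most valuable thing.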
I use opportunity cost whenever anyone asks me to do something.
As an employee, if a project manager asks me to tune a particular application, I bring them into my manager’s office and say, “It will take me three days to make that application faster. I’ll probably make it an order of magnitude faster, because I’ve never tuned that server before. However, if I take those three days to do it, I won’t make the deadline for Project Snazzywidget. Which one is worth more to you?” At that point, it’s a political decision and a business decision, not a technical decision. If you’re doing the best job of any employee he has, your manager will put you on the most valuable project – which in turn increases your worth again.
As a consultant, I approach the problem differently: “Here’s the thing – I could spend another three days working on this application, but from this point forward, I’m only going to be able to achieve incremental improvements, not the order-of-magnitude improvements we saw in the first few rounds of tuning. I hate to make you guys go through that for a small gain – but is there another application in-house that isn’t delivering the performance you want?” This resets the client’s expectations, and they start seeing you as a weapon that they can point at slow applications. They’ll cherish your time and focus you where your effort will pay off the most. This keeps your perceived ROI high. If you deliver jaw-dropping results each time you tune an application, you can justify higher billable rates.
After all – isn’t your bank balance the one metric you really want to improve?