When an application is offline or performing so badly that the users are complaining, what do you turn to?
When your cellphone says a major incident is ongoing and you can’t get to a computer, where do you send the team at the office?
When you’re putting out fires with whatever you can find, how do you record what steps you’ve taken and what information you’ve gathered?
You’re Only As Strong As Your Team
One major reason to put together a troubleshooting checklist is simple: you can’t always be working.
If you’re the only person in your office who knows how to use SQL Server, you need to train someone. They don’t need to be an expert; they just need to be able to run through your checklist successfully and gather information. Find someone sensible and practical, with a steady hand, who you can practice the steps with, and tailor the checklist so it makes sense to your secondary.
If you work with a large team of SQL Server experts, you need a troubleshooting checklist just as much. As we gain experience, we each develop our own ways of doing things. We focus on different areas and may interpret the same information differently.
The troubleshooting checklist gives your team consistency: it gives everyone a base process to make sure the major areas are covered.
You’re Only As Smart As Your Documentation
The other major reason you want a troubleshooting checklist is that all the rules change when things get really bad.
When a company is losing money, it’s hard to do things in a logical fashion. You’ve got four people at your desk asking all manner of questions, from “we just bounced the Heffalamps services, is it OK now?” to “are the Wuzzles impacted by this problem?” Your boss’s boss keeps poking their head around your monitor saying, “I’m just checking in.” You’ve got 500 emails from your monitoring system.
When this happens, you remember about 75% of all the basic stuff to check. It’s incredibly easy to miss some obvious things, though. After all, you’ve needed to go to the bathroom for about forty-five minutes and still can’t leave your desk.
The troubleshooting checklist helps here in three ways:
- You always hit a consistent list of things that are important to your business.
- If someone misses a step when following up on a problem, you have a place to add a step to ensure the mistake doesn’t happen again. The troubleshooting checklist gives you a way to correct human error. That’s the secret to de-personalizing an embarrassing mistake and instead showing you’re a professional who follows up on errors and is in control of their process.
- You have a place to record information which others can read. This helps you clear out that crowd from behind your desk! Save the checklist in a place where they can read it, and let them know they can see updates on your progress if they let you get going.
To Do: Give Our Checklist To Your Manager
There’s one person who really wants you to have a troubleshooting checklist, but they think of it in slightly different terms. They think of this as an ‘Incident Management Response Process’.
Your manager would love to have a predictable response to problems. This helps them understand and explain to others what value you add to the company. It also helps them understand and justify having someone who can be your backup when you’re not available. It helps your manager show that you’re working to have an organized production environment with defined processes for keeping applications available.
Here’s how to handle this. Download our SQL Server Troubleshooting Checklist and give it a read. Think about a couple of things you’d customize for your environment.
Then show the checklist to your manager and say, “I think having a basic process like this would be helpful for our team. I’d like to lead a project customizing it for our applications. What do you think?”
According to Penelope Trunk, the path to promotion is shortest in December.
Now there’s a holiday score.