David Stein wrote a post called “Pop Quiz Hotshot” about starting your disaster recovery plan *now*. It’s a great read with good points that everyone needs to act on, but the comments indicate that not everybody sees eye to eye. As usual, I rely on similes because I’m lazy.
How to Lose Weight
Get your pencils ready, because I’m about to give you the ultimate weight loss tip. It’s going to sound almost too good to be true because it’s so darned easy, and here’s the craziest part – it can actually save you money! That’s right – this is the tip that the exercise equipment industry, the personal trainer industry, and the vitamin industry are desperately trying to keep under wraps. You ready? Here it comes. Don’t blink – you might miss it.
Eat less. A lot less.
I know – it sounds ridiculous, but that wild technique helped me drop 40 pounds in under a year. I didn’t exercise one bit, either.
I know what you’re thinking – you’re thinking, “But Brent, how can I possibly lose weight without spending money? Don’t I have to spend a fortune on the FlabBlaster 3000 just like Chuck Norris tells me to?” Far be it from me to disagree with Chuck – very far – but…
Buying Hardware Doesn’t Fix Bad Practices
No matter how much you spend on exercise equipment or systems management, you and your servers aren’t going to get healthier when the stuff sits on the shelf.
Money can’t buy you health.
It can pay for experts to come in and fix you when you’re sick, but it can’t keep you healthy – that part is up to you. The very first step to getting healthier, and this goes for both your servers and your thighs, is to change your habits. Elbow grease has amazing results when applied liberally.
Start Testing Your Restores. Now.
If you don’t have a spare server lying around, use somebody’s spare desktop. We all have ancient machines sitting in closets from our last upgrade or from that employee who just got fired because he couldn’t restore a dropped table. (Get it? That’s a hint.) Get that machine, and throw in a 1.5-terabyte drive for about $100. Yes, use your own money if the company won’t pay, because this is an investment in your career. If you’ve got several machines lying around, consider combining their memory if possible, but don’t sweat it – this is only your training wheels system.
Install the OS again from scratch, and put SQL Server Developer Edition on there or the 180-day evaluation version of SQL Server 2005 or 2008. Don’t overcomplicate your life by trying to get every best practice perfect – even if everything’s installed on a 1.5 TB C drive, this system will still work for the basics.
Start by testing your restores once per week. The first few times you do it, don’t try to script the whole thing out – just use SQL Server Management Studio and point/click your way through it. Remember, high bang, low buck/effort: we want this whole thing to take less than two hours per week of your time, max. The restores aren’t going to be fast, but the point is to figure out whether we can do them at all. After a couple of weeks, you’ll start scripting your work as you find more and more things that aren’t included in your test system – logins, DTS packages, jobs, whatever. Document what you’re doing along the way, because every time you find something else that has to be done to make the server work, that’s one less lesson you’d need to learn under the gun.
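Once you get to the scripting stage, the weekly restore can start out as something very small. Here’s a minimal sketch in Python that just builds the RESTORE DATABASE statement for the test box. The database name, backup path, and logical file names below are made up for illustration, and the helper names are my own, not part of any tool. Swap in whatever your backups actually contain.

```python
def quote_literal(value):
    """Return a T-SQL N'...' string literal, doubling embedded single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def quote_name(name):
    """Return a bracket-quoted T-SQL identifier, doubling embedded ']'."""
    return "[" + name.replace("]", "]]") + "]"


def build_restore_script(database, backup_file, file_moves):
    """Build the T-SQL for one restore test.

    file_moves maps each logical file name in the backup to the physical
    path it should land on in the test system (hypothetical values here).
    """
    options = [
        "    MOVE {} TO {}".format(quote_literal(logical), quote_literal(physical))
        for logical, physical in file_moves.items()
    ]
    # REPLACE overwrites last week's copy; STATS = 10 reports progress every 10%.
    options += ["    REPLACE", "    STATS = 10"]
    return "\n".join([
        "RESTORE DATABASE " + quote_name(database),
        "FROM DISK = " + quote_literal(backup_file),
        "WITH",
        ",\n".join(options) + ";",
    ])


if __name__ == "__main__":
    # Example values only: adjust names and paths to match your own backup.
    print(build_restore_script(
        "Sales",
        r"C:\Backups\Sales_full.bak",
        {
            "Sales_Data": r"C:\RestoreTest\Sales.mdf",
            "Sales_Log": r"C:\RestoreTest\Sales_log.ldf",
        },
    ))
```

Paste the output into Management Studio and run it by hand at first. The script only covers the database itself, so the logins, jobs, and packages you discover along the way still belong in your documentation.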
One Month Later: Add the Apps
After you’re comfortable restoring the database, try to configure your application. Install IIS, DLLs, code, whatever else you might need to get the app to run. If you don’t manage the app, ask the app guys to take another old desktop and try to do their part to set up a restore testbed for themselves. If they don’t want to, that’s okay – but now you’re starting to build up some cover for your rear end.
Some things might not work in your environment. For example, if you’re using the evil xp_cmdshell, your developers may have hard-coded paths and files into their code. The faster you find things like this, the faster you can get them fixed before disaster strikes. When disaster strikes, these problems won’t be seen as developer mistakes – you’ll get blamed, because you can’t make the server work the way it used to. 99% of your problems won’t stem from hardware that you can buy with a check, though – they’ll stem from practices. Stop waiting for the company to buy you a Thighmaster, and start doing pushups. It’s better than nothing, and when disaster strikes, the last thing you want to have is nothing.
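For the hard-coded path problem mentioned above, you don’t have to wait for the app to break to find the offenders. Here’s a rough sketch, assuming you’ve already scripted your stored procedures and jobs out to plain text: it flags anything that looks like a drive-letter or UNC path and notes whether xp_cmdshell is on the same line. The regular expression is a simple heuristic of mine and will miss paths that are built up from variables, so treat the results as a starting list, not a guarantee.

```python
import re

# Heuristic: drive-letter paths (C:\...) and UNC paths (\\server\share...).
PATH_PATTERN = re.compile(r"(?:[A-Za-z]:\\|\\\\)[^\s'\"]+")


def find_hardcoded_paths(sql_text):
    """Return (line_number, path, uses_xp_cmdshell) for each path found."""
    findings = []
    for line_number, line in enumerate(sql_text.splitlines(), start=1):
        uses_shell = "xp_cmdshell" in line.lower()
        for match in PATH_PATTERN.finditer(line):
            findings.append((line_number, match.group(0), uses_shell))
    return findings


if __name__ == "__main__":
    # Made-up sample code, standing in for a scripted-out stored procedure.
    sample = "\n".join([
        r"EXEC master..xp_cmdshell 'copy D:\Exports\daily.csv \\fileserver\drop';",
        "SELECT 1;",
    ])
    for line_number, path, uses_shell in find_hardcoded_paths(sample):
        note = " (via xp_cmdshell)" if uses_shell else ""
        print("line {}: {}{}".format(line_number, path, note))
```

Every path it turns up is a question for the developers: does that drive or share exist on the restore box, and what happens when it doesn’t?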
When there’s enough basic plumbing in place that you think everything works, format the box and start over. Use your documentation and try to repeat the whole process. The first several times you do this, you’re going to continue to find more errors and gotchas.
When you think your documentation is complete, format the box and hand the documentation to your junior person or your manager. Say, “I’ve got a set of steps to follow when disaster strikes, and I want you to test them for me, because if I’m not around then you’ll be the one doing it.” They will be shocked, but down the road they’ll appreciate your due diligence.
This kind of disciplined effort is why experienced DBAs walk around with an air of confidence. The best DBAs aren’t worried about what happens when disaster strikes, because they’ve already practiced it again and again and again. When I was a DBA, I liked to say that disaster struck every week for me – it just struck in my lab.