David Friend, the CEO of Carbonite, left a comment on my blog entry about Carbonite. I applaud him for taking the time to do that, but the comment raised some ugly questions. It appears that they were putting data on 15-drive RAID 5 arrays. RAID 5 is the most cost-effective array setup (other than RAID 0, which offers no data protection).
RAID 5 will only tolerate a single drive failure: if you lose more than one drive in an array, the whole array is gone and must be restored from backup. As SATA drives grow larger and larger in capacity, they take longer and longer to rebuild when one goes bad, because so much data must be reconstructed from the surviving drives and written to the new drive. If a second drive fails while the first one is being rebuilt, you’re completely out of luck, and must restore from backup. The more drives you add to a RAID 5 array, the riskier it gets, because it’s more likely that one of the remaining drives will fail during the rebuild. That’s pretty dangerous for a company that makes its living off your backups being available.
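To put rough numbers on that rebuild-window risk, here is a minimal Python sketch. It assumes drives fail independently at a constant rate, which is a simplification: real arrays also see correlated failures and unrecoverable read errors, so the true risk is likely higher. The function name, the 3% annual failure rate, and the 24-hour rebuild time are my own illustrative assumptions, not figures from Carbonite.

```python
import math

HOURS_PER_YEAR = 24 * 365


def second_failure_probability(drives, annual_failure_rate, rebuild_hours):
    """Estimate the chance that at least one surviving drive in a RAID 5
    array fails while a single failed drive is being rebuilt.

    Assumes independent drives with a constant (exponential) failure rate.
    """
    if drives < 2:
        raise ValueError("need at least two drives")
    if not 0 <= annual_failure_rate < 1:
        raise ValueError("annual_failure_rate must be in [0, 1)")
    if rebuild_hours < 0:
        raise ValueError("rebuild_hours must be non-negative")

    # Convert the annual failure probability into an hourly hazard rate.
    hourly_rate = -math.log(1.0 - annual_failure_rate) / HOURS_PER_YEAR

    # One drive has already failed; any of the others can kill the array.
    survivors = drives - 1
    return 1.0 - math.exp(-survivors * hourly_rate * rebuild_hours)


if __name__ == "__main__":
    # Illustrative inputs only: 3% annual failure rate, 24-hour rebuild.
    for n in (4, 8, 15):
        p = second_failure_probability(n, 0.03, 24)
        print(f"{n:2d} drives: {p:.4%} chance of a second failure during rebuild")
```

Under this model the risk grows with both the number of drives and the length of the rebuild, which is exactly the combination you get from wide arrays of large SATA drives.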
Worse, it isn’t clear from the interviews I’ve seen that Carbonite actually backed up your data somewhere other than those RAID 5 arrays. It appears that their attitude towards backup was that YOUR machine held THEIR backup: when they ran into problems with their RAID arrays, Friend commented that:
“Carbonite automatically restarted all 7,500 backups…”
Meaning, they started pulling data again from their clients’ machines rather than restoring it from tapes or other arrays. This is further evidenced by interviews that the Enterprise Storage Group conducted with Friend last year, before the lawsuit came out, as blogged by the group’s Steve Duplessie. When asked how Carbonite protected their backups, Friend replied that:
“This is backup – not archiving, so if that ever happened you’d still have your data on your PC.”
Like Bryan Oliver says, the only reason we do backups is to do restores. If the answer to restore problems is to use your live server, that’s a failure. Carbonite didn’t protect against regional internet outages, either:
“Regional internet outages (we use multiple redundant carriers) would take us offline if they all failed. But again, unless you were in the middle of a restore when it happened, you’d probably never notice.”
Translation: if you aren’t using our services, you’ll never notice when we’re down. How about replication from one site to another?
“…we don’t replicate data across multiple sites. The likelihood of losing data because of software bugs or human error is probably orders of magnitude greater.”
Errr, not sure why he’d say that, since Carbonite wasn’t protecting against data loss due to bugs either.
I can see where Carbonite’s coming from: they view themselves as a cheap way to protect data, and it works most of the time. That’s the same approach I take with Amazon S3 cloud storage for my personal backups, incidentally: for a few bucks a month, it’s a cheap insurance policy. I have my data replicated across my laptop, my Time Machine external hard drive, and my VMware server. If there’s a house fire, my stuff might be on Amazon S3. It also might not. Amazon hasn’t made me any promises about the safety and security of my data.
My problem with Carbonite’s approach is that they seemed to take my personal data backups even less seriously than I do, maintaining a single copy in a single place. For all I know, maybe Amazon’s using that exact same approach. Only time and lawsuits will tell.
For more commentary on the Carbonite problems, check out Steve’s blog post, “Head in the Cloud? Or Just up your……..?” (And yes, he apparently stole that from Wilbur.)
