More On the Carbonite Backup Failures

David Friend, the CEO of Carbonite, left a comment on my blog entry about Carbonite.  I’d like to applaud him for taking the time to do that, but the comment raised some ugly questions.  It appears that they were putting data on 15-drive RAID 5 arrays.  RAID 5 is the most cost-effective array setup (other than RAID 0, which offers no data protection).

RAID 5 will only tolerate a single drive failure: if you lose more than one drive in an array, the whole array is gone and must be restored from backup.  As SATA drives grow larger and larger in capacity, they take longer and longer to rebuild when one goes bad, because the entire contents of the replacement drive must be reconstructed from the data and parity on every surviving drive.  If a second drive fails while the first one is being rebuilt, you’re completely out of luck, and must restore from backup.  The more drives you add to a RAID 5 array, the riskier it gets, because it’s more likely that one of the surviving drives will fail during the rebuild.  That’s pretty dangerous for a company that makes its living off your backups being available.
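
To put rough numbers on that, here’s a back-of-the-envelope sketch in Python.  The assumptions are mine, not Carbonite’s: drives fail independently at a constant rate, the 3% annual failure rate is a made-up illustrative figure, and the rebuild times are hypothetical.  Real arrays are probably worse off, since drives bought in the same batch tend to fail together and this ignores unrecoverable read errors during the rebuild.

```python
import math

HOURS_PER_YEAR = 24 * 365


def second_failure_probability(drives, annual_failure_rate, rebuild_hours):
    """Chance that at least one surviving drive fails during the rebuild.

    Assumes independent drives with a constant (exponential) failure rate,
    which is a simplification for illustration only.
    """
    # Convert the annual failure rate into a constant hourly hazard rate.
    hourly_rate = -math.log(1.0 - annual_failure_rate) / HOURS_PER_YEAR
    survivors = drives - 1  # one drive has already failed
    return 1.0 - math.exp(-survivors * hourly_rate * rebuild_hours)


# Illustrative inputs only: 3% annual failure rate, assorted rebuild times.
for drives in (5, 15):
    for rebuild_hours in (6, 24, 72):
        p = second_failure_probability(drives, 0.03, rebuild_hours)
        print(f"{drives:2d} drives, {rebuild_hours:2d}h rebuild: {p:.4%}")
```

Under this simple model the risk scales with both the number of surviving drives and the length of the rebuild, so a 15-drive array is roughly three and a half times as exposed as a 5-drive array for the same rebuild window.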

Worse, it isn’t clear from the interviews I’ve seen that Carbonite actually backed up your data somewhere other than those RAID 5 arrays.  It appears that their attitude towards backup was that YOUR machine held THEIR backup: when they ran into problems with their RAID arrays, Friend commented that:

“Carbonite automatically restarted all 7,500 backups…”

Meaning, they started pulling the data again from their clients’ machines rather than restoring it from tapes or other arrays.  This is further evidenced by an interview that Steve Duplessie of the Enterprise Strategy Group conducted with Friend last year, before the lawsuit came out, and wrote up on his blog. When asked how Carbonite protected their backups, Friend replied that:

“This is backup – not archiving, so if that ever happened you’d still have your data on your PC.”

Like Bryan Oliver says, the only reason we do backups is to do restores.  If the answer to restore problems is to use your live server, that’s a failure.  Carbonite didn’t protect against regional internet outages, either:

“Regional internet outages (we use multiple redundant carriers) would take us offline if they all failed. But again, unless you were in the middle of a restore when it happened, you’d probably never notice.”

Translation: if you aren’t using our services, you’ll never notice when we’re down.  How about replication from one site to another?

“…we don’t replicate data across multiple sites. The likelihood of losing data because of software bugs or human error is probably orders of magnitude greater.”

Errr, not sure why he’d say that, since Carbonite wasn’t protecting against data loss due to bugs either.

I can see where Carbonite’s coming from: they view themselves as a cheap way to protect data, and it works most of the time.  That’s the same approach I take with Amazon S3 cloud storage for my personal backups, incidentally: for a few bucks a month, it’s a cheap insurance policy.  I have my data replicated across my laptop, my Time Machine external hard drive, and my VMware server.  If there’s a house fire, my stuff might be on Amazon S3.  It also might not.  Amazon hasn’t made me any promises about the safety and security of my data.

My problem with Carbonite’s approach is that they seemed to take my personal data backups even less seriously than I do, maintaining a single copy in a single place.  For all I know, maybe Amazon’s using that exact same approach.  Only time and lawsuits will tell.

For more commentary on the Carbonite problems, check out Steve’s blog post, “Head in the Cloud? Or Just up your……..?” (And yes, he apparently stole that from Wilbur.)

11 Comments

  • Thanks for the explanation Brent. I knew how Raid 5 worked, but never used it with more than 5 drives or so and hadn’t considered the risk as the number of drives increased.

    As far as Carbonite goes, I also want to compliment you for being so diplomatic. I’m not sure they deserve to be treated so benignly.

  • I like it, backup != archive, ha ha

    They could go RAID6 to allow 2 disk failures if they are really low on budget
    RAID 6 (striped disks with dual parity) (less common) can recover from the loss of two disks.

    I recently learned about DropBox, free 2GB space that works across multiple OS’, not a bad way to sync/backup stuff either

  • I cannot see how Carbonite could’ve made the decision to run 1:15 RAID5’s in the first place, even with their laissez faire attitude on backup vs. archiving. The cost of running RAID6 over RAID5 should be minimal, and the security increases by several magnitudes.

    Furthermore, it worries me that their CEO seems to be less than positive on the actual storage methods & numbers. I hope it’s just a typo, “an improved RAID that allows for the loss of 3 of the 15 drives simultaneously”, if not, I’d like to get that RAID6 array that’ll handle triple disk failures. It could be that our storage array is seriously underperforming, but there’s no way I can rebuild a failed disk on a RAID5/6 15 disk array in “a few hours”.

    Anyways, it’s easy to put blame afterwards, especially if one’s not employed at the company. However, 15 disk RAID5’s are unforgivable, imho.

  • This is extremely surprising considering you’re talking about a company whose entire business model is centered around storage for recovery. I can understand RAID 5. We’re talking files. We’re not talking databases. But not replicating data to other facilities? What? If your business model has multiple locations, use ’em to back each other up.

  • I admit, I’m not very familiar with Carbonite’s offering, but one of the features other services offer, is the ability to recover prior versions of files. Let’s say I am working on a document and realize that I liked the version I had last week better, I could go to the service and download the file as it appeared last week.

    If Carbonite doesn’t offer this feature, they are way behind other offerings. If they do offer this feature, how can they possibly think that the customer’s data is a sufficient backup after losing a few hard drives in their RAID array???

  • Looking for Reliability
    September 11, 2009 7:49 pm

    I have read through all the comments on both pages about Carbonite and am now wondering: are there any other companies that offer similar ease of use and price but are run by someone who understands data protection? I can’t believe a company that deals with backups in any way would not be able to tout multiple redundancy.
    I don’t care if the supposed chances of failure are 1 in 1 or 1 in a googolplex. The technology is there for multi-media, multi-site data replication, they should use it.

  • Brent,
    Great follow-up to your “Another backup failure: Carbonite” post. Not sure how I missed it until now. I have been using Carbonite on my MacBook Pro for a couple of months now to back up the usual files (photos, documents, and QuickBooks). After reading the comments by the CEO of Carbonite and everyone else’s input, I’m thinking it’s time to go pick up an Apple TimeCapsule or one of those new Iomega IX2-200.

    Quick question though. Are you doing any kind of online / offsite backups or just keeping everything physically in your house? I get a little worried when I think about all the source code, photos and documents I have spread out between my laptop, desktop and Virtual Machines.

  • For nontrivial problems, there are issues with Carbonite’s customer service, too. See http://bit.ly/UgzOA. They try, but for difficult problems, they seem to simply be swamped.

    From what I can see of it, their software isn’t bad, but they’re not giving it a chance.

  • i would not recommend Carbonite to anyone…..
    i dont know how many of you have actually had to use your “back-up” that carbonite has stored, but this was our experience with them, we had a raid 5 failure due to the hardware chipset on the motherboard malfunctioning (ASUS mobo)..
    after about 24 hours we started replacing the mobo after figuring out what the actual issue was.
    in this 24 hour period carbonite decided that we didnt need our back-up anymore and deleted it.
    (you have to call them and put a stop/hold on your account for them not to do this)
    all in all it was an extreme headache, i dont know if they are still using this “technique” but it’s worth checking into if you care about your data, and you’re trusting it with a company that really seems like they could care less about what you need…
