Another backup failure: Carbonite

TechCrunch reports that Carbonite, an online backup company, lost customer data.

But wait, this is different: it’s not their fault.  They’re suing Promise Technology, makers of popular storage gear, for selling them bogus equipment.  Bogus equipment?  You mean, like hard drives that fail?  That’s horrible!  Who could expect something like that?  Who could know about the dangers that lurk around every corner?

The Statistics are Staggering Alright

Carbonite’s web site warns, “You need to be aware that losing your most valuable files is a very real possibility.  You need to take proper precautions.”

Who knew they were referring to their own services?

Don’t point and laugh and say it could never happen to you because you do your own backups in-house. I’ve seen too many backup strategies fail for too many reasons.  For the love of your own job, never mind your company’s revenue stream, take some time this week to:

  • Automate your backup testing – build a set of T-SQL scripts to automatically restore your production databases onto another server.  Restore a different server every day onto the same target testbed box.
  • Test your backups manually – if you don’t have the time to script the tests, just go run a restore of your largest backup.  Ideally, check the ones that hit tape, because those are the riskiest.
  • Check every server’s job logs – I’ve seen so many cases where backups stopped working on a SQL Server, and the alerting had long since stopped working too.  These two failures are a 1-2 punch to the jaw of your career.
  • Find your single points of failure – if you’re relying on a single cloud vendor for all of your data protection, that’s a risk.  If you’re backing up straight to tape and you’ve only got one tape jukebox in-house, that’s a risk.
  • Figure out who you’re going to sue – because hey, work is hard.  If you can’t do it right, get rich trying.
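As a rough sketch of that first bullet, here is one way the daily rotation could be driven. It's Python rather than T-SQL purely for illustration; the function names, server names, and backup path are made up, and the `DBCC CHECKDB` step is an extra I'm assuming you'd want, not something the list above spells out.

```python
from datetime import date


def pick_server_for(day: date, servers: list[str]) -> str:
    """Rotate through the production servers, one per calendar day."""
    if not servers:
        raise ValueError("no servers to test")
    # Consecutive days map to consecutive list positions, wrapping around.
    return servers[day.toordinal() % len(servers)]


def restore_script(database: str, backup_path: str) -> str:
    """Build T-SQL that restores one backup onto the testbed and checks it."""
    db = database.replace("]", "]]")      # escape for a bracketed identifier
    path = backup_path.replace("'", "''")  # escape for a string literal
    return (
        f"RESTORE DATABASE [{db}] FROM DISK = N'{path}' WITH REPLACE, RECOVERY;\n"
        f"DBCC CHECKDB ([{db}]) WITH NO_INFOMSGS;"
    )


# Hypothetical inventory; in practice this would come from your own server list.
servers = ["SQLPROD1", "SQLPROD2", "SQLPROD3"]
todays_server = pick_server_for(date.today(), servers)
script = restore_script("Sales", rf"\\backups\{todays_server}\Sales.bak")
```

A real restore onto a different box usually also needs `WITH MOVE` clauses to relocate the data and log files, which this sketch leaves out.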

22 comments to Another backup failure: Carbonite

  • I find it interesting that you mention backups not working and alerting stopped. I had this situation happen about a year and a half ago. There was a “departmental” SQL Server that I did not know about that was not UPS protected. During a power outage in August the msdb database was corrupted and marked as suspect. So the backup jobs quit running. In November the hard drive failed. This is when I was brought in to try to “fix” the database. Fortunately the PC support department had an Acronis Image of the box from about a week earlier. I had to rebuild msdb and rebuild all the jobs. The vendor had created backup jobs for the user databases, but not the system DBs.

  • Fatherjack

    Hopefully this won’t be a big issue for the end user – the Carbonite customers.

    It’s the backups that have been lost – the originals are still intact on the users’ PCs/networks. We get tapes that fail now and again; it’s only a risk while the original is getting a fresh backup done.

    Presumably while the techies get the hardware fixed, the customer service part of Carbonite is in touch with all customers telling them to do a local backup … If they didn’t then I wouldn’t trust them with a copy of my toilet tissue; customer relations needs to be there even if you screwed up. Even if you screwed up because your supplier let you down. Customers don’t care about that, they bought your service from you.

  • It would seem that right about now the Empire is reimbursing Boba Fett.

  • I would like to make sure that your readers understand two points with regard to Carbonite’s lawsuit against Promise Technology:

    1) This event happened over a year ago. We do not say this to minimize the matter. But we do want to point out that this has not happened in a long time and is not an ongoing problem.

    2) The total number of Carbonite customers who were unable to retrieve their data was 54, not 7,500.

    Here is what happened: The Promise servers that we were purchasing in 2006 and 2007 use RAID technology to spread data redundantly across 15 disk drives so that if any one disk drive fails, you don’t lose any data. The RAID software that makes all this work is embedded as “firmware” in the storage servers. In this case, we believe that the firmware on the servers had bugs that caused the servers to crash. Carbonite automatically restarted all 7,500 backups and more than 99% of these were completely restored without incident. Statistically, about 2 out of every 1,000 consumer hard drives will crash every week, so 54 of these customers had their PCs crash before their re-started backups were complete. Since they weren’t completely backed up when their PCs crashed, these customers were unable to restore all of their files from Carbonite. Most of the 54 got some or most of their data back. We took full responsibility for what happened and I did my best to call each of these customers personally to apologize.

    As a result of our problems with the Promise servers, we switched to a popular Dell server that uses RAID6 – an improved RAID that allows for the loss of 3 of the 15 drives simultaneously before you lose any data. This configuration is in theory 36 million times more reliable than a single disk drive — the chances of 3 out of 15 drives failing at the same time are almost nil.

    So far, Promise has refused to accept responsibility for their equipment’s failures, so now we are suing them to get our money back. The Dell RAID servers have been flawless and we’re extremely happy with them.

    Dave Friend, CEO
    Carbonite, Inc.

  • Thanks for commenting! I appreciate the clarification. As you’ve found, though, RAID 5 isn’t enough for serious data protection – especially when you’re running a 15:1 ratio. It’s difficult to rebuild drives fast enough in the event of a drive failure – your window for data loss is pretty unacceptably large.

    It’s okay if you can’t answer more questions – I would completely understand – but the explanation begs more questions, like:
    – Did you only store the data online on the Promise arrays, and not back them up to tape?
    – Did you only store each customer’s data on a single Promise array?

  • Brent: Keep in mind that this is backup, not archiving. If you lose the backup for some reason, the user still has the data on his PC. If you look at RAID6 (you have to lose 3 drives within a few hours of each other) PLUS the user’s PC failing at the same time, when you run the numbers you find that the statistical likelihood of ever losing a RAID6 server at the same time the user’s PC crashes is so close to zero that few users would want to pay more for even further redundancy. Keep in mind that by far the most popular backup method today is the external hard drive. What Carbonite and other online backup vendors offer is already millions of times more reliable. If we were doing archiving, that would be a different story.

    Dave

  • Dave – you’re not answering the questions:
    - Did you only store the data online on the Promise arrays, and not back them up to tape?
    - Did you only store each customer’s data on a single Promise array?

    By not answering them, I can only assume that your answer is no.

    If that’s the case, you and I have a fundamental difference of opinion about what a backup means. If you don’t believe you need reliability, why bother using RAID at all? By the sound of your argument, you could avoid RAID altogether because heck, the customer always has the real data, right? Just fetch it from ‘em again and you’ll be fine.

    We both know that’s not a valid answer, and neither is your answer of banking everything on RAID with no backups.

  • In theory, RAID 6 is nearly bulletproof. In practice, not so much. All it takes is the same set of drives from a bad batch to hose everything up. Or not replacing drives as mean time to failure approaches. Especially if you have to power down for any amount of time where everything stops spinning.

  • Exactly! And of course, when you’re buying whole servers in groups like Carbonite must be, you’re even more likely to get drives in bad batches.

  • Regarding your bullet list at the end – our team encountered a situation wherein we didn’t have a backup we expected to have, so since that point, nearly 3 years ago, we have a job that checks every database across the entire environment to see when it was last backed up, and alerts us if it finds anything more than 1 day out. It all runs from a table of meta data that allows us to tweak which servers to monitor and even which databases. We don’t care if the replicated database isn’t backed up, but we sure do want to know when the production database isn’t!
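A check along those lines might look like the sketch below. It assumes you've already pulled the most recent backup time per database into a dictionary (on SQL Server that history lives in `msdb.dbo.backupset`), and that the metadata table boils down to a list of server/database pairs to watch. The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta


def stale_backups(last_backups, monitored, now, max_age=timedelta(days=1)):
    """Return the monitored (server, database) pairs with no recent backup.

    last_backups: dict mapping (server, database) -> datetime of last backup
    monitored:    iterable of (server, database) pairs we care about
    now:          the time to measure backup age against
    """
    stale = []
    for key in monitored:
        last = last_backups.get(key)
        # A database with no backup on record is treated as stale too.
        if last is None or now - last > max_age:
            stale.append(key)
    return stale


# Made-up example data: one fresh backup, one old, one missing entirely.
now = datetime(2009, 3, 25, 8, 0)
last_backups = {
    ("SQL1", "Sales"): now - timedelta(hours=3),
    ("SQL1", "HR"): now - timedelta(days=2),
}
monitored = [("SQL1", "Sales"), ("SQL1", "HR"), ("SQL2", "Orders")]
alerts = stale_backups(last_backups, monitored, now)
```

Leaving a replicated copy out of the `monitored` list is how you'd skip databases you don't care about, while still catching a production database that has never been backed up at all.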

  • Chris

    Brent, I’m not sure that the expectation for their customers was the same as a double-redundant enterprise system, although I agree that having only a RAID5 setup w/ a 15:1 ratio wasn’t exactly smart either… Personally, I had issues w/ an inexpensive Promise sata raid controller a few years ago, so not a fan of their hardware either… but it IS pretty cheap!

    If you consider Carbonite’s competition as a single external drive or regular CD/DVD backups, they seem to have much better security at a reasonable price. Compared to a widely-distributed SAN + Tape… yeah, they fall a bit short… but what’s the $$ difference?

  • Chris – I would highly disagree about Carbonite offering better security than a single external drive. Read my recent post about adding reliability to your infrastructure – a 15:1 RAID 5 setup is more complex and possibly less reliable than a single drive (as Carbonite has clearly found out and sued over.)

    http://www.brentozar.com/archive/2009/03/adding-reliability-to-your-infrastructure/

    It’s definitely less reliable than DVD backups, since they repeatedly tout that “We’re backups, not archives”. With plain DVD backups, you at least get archival.

  • vijay

    Carbonite customers’ data loss is not Promise’s fault. For some more context on this case, see Promise’s response in a letter sent to customers this week at http://www.promise.com/support/Announcements.asp.

  • Bruce Goldensteinberg

    I am Bruce Goldensteinberg, the person who brought Carbonite’s unethical activities to NY Times Tech Columnist David Pogue.

    Notice that wherever there is legitimate criticism of carbonite anywhere on the internet, David Friend or one of his lackeys pops up out of nowhere explaining that whatever the issue in question was is not Carbonite’s fault, but instead, the fault of either the user, or in this case, their supplier.

    Not only was it bad enough that carbonite hacks posted reviews on amazon (while being boneheaded enough, or contemptuous enough of potential customers, to use their own names), now they are admitting what many carbonite users have experienced for a while.

    their service is shoddy. the “customer support” is nonexistent. just look around on the internet, look at the reviews of carbonite in many different places- i.e. amazon. now if most of the reviews have the same criticisms, there is likely to be some truth to them. i refuse to believe that all, or even most companies, are so unethical as carbonite and go around planting negative reviews about competitors.

    carbonite is a scam of a company. if they took even a fraction of the money they put into advertising on rush limbaugh and others into building a better product with better equipment and better customer service, they wouldn’t have these issues. but david friend even said in an article on xconomy.com a while back he hopes to sell Carbonite for close to a billion dollars in a few years, based upon the amount EMC paid for Mozy.com (another big, visible backup company) a few years ago

    • Randy

      I found this thread after searching for “carbonite backup criticism”. I heard their ad on the radio and I was curious.

      I am also a business owner and I can tell you, firsthand, that there are just plain incredibly horrible people out there who are absolutely willing to destroy you for one mistake (even if you own up to it).

      I’ve read all the comments here…the reviews… and I think it’s an impressive act when a CEO willingly enters in a discussion like this.

      Damned if he does, damned if he doesn’t. How terrible a company wants to grow and advertises on RUSH LIMBAUGH of all places! Wow! Horrific!

      Want to be big business?… you get labeled like you were a nazi.

      Anyone ever send a letter to Apple and get a reply? I have… three attempts… never a reply… they’re the #1 in customer satisfaction (where is the objectivity?)

      Ok, this plainly means to me that a highly significant percentage of customers are not receiving the service.. if that was so I think I’d been seeing quite more complaints.

      I am FAR more willing to give credence to a CEO who takes the time to defend what he/she’s helping create than someone who main weapons are venom, a thesaurus, and have absolutely no idea how much this company spends on whatever they do and it makes no damn difference either way because if the service works, it’s affordable, and reliable then why the hell are you not helping these folks. My local PC repair place gets people coming in with shot drives every damn day. Me too once… shit… for $55 a year I might just use TWO OR THREE of these damn services! Why the hell not!

      • Randy – you’re blurring a couple of distinctions here, and I’ll address your points one by one.

        “there are just plain incredibly horrible people out there who are absolutely willing to destroy you for one mistake (even if you own up to it).”

        If you’re accusing me of being a plain incredibly horrible person, I would probably agree. Carbonite’s “one mistake” was not backing up their data. That’s a very significant, business-ending mistake, especially for a company who’s in the business of backups. If I’m a horrible person for pointing out the fact that a backup company isn’t doing backups, I’m fine with being horrible.

        “I think it’s an impressive act when a CEO willingly enters in a discussion like this.”

        Agreed.

        Rush Limbaugh and Nazi comments

        I dunno where to go with that one, sir.

        “Anyone ever send a letter to Apple and get a reply? I have… three attempts… never a reply…”

        That’s because if you have a problem with Apple, you don’t send letters. You call support or you visit their stores. I’ve had several Apple problems, gone to the nearest store, and gotten hands-on help from the Genius Bar. They’re very helpful and friendly.

        “Ok, this plainly means to me that a highly significant percentage of customers are not receiving the service.. if that was so I think I’d been seeing quite more complaints.”

        Search the web for Carbonite outage, Carbonite problems, or Carbonite sucks, and you get hundreds of thousands of results.

        “I am FAR more willing to give credence to a CEO who takes the time to defend what he/she’s helping create than someone who main weapons are venom, a thesaurus, and have absolutely no idea how much this company spends on whatever they do and it makes no damn difference either way because if the service works, it’s affordable, and reliable then why the hell are you not helping these folks.”

        That’s cool, I understand your opinion. My main weapon, though, isn’t venom or a thesaurus – it’s the fact that I’m a database expert and former storage administrator tasked with backups for a multi-billion dollar enterprise. I pointed out that you can’t safely store data without backing it up, plain and simple, especially if you’re a backup company.

        The fact that you’re saying it’s “reliable” points out that you don’t understand the issue – Carbonite lost customer data. That’s not reliable, period.

        “shit… for $55 a year I might just use TWO OR THREE of these damn services! Why the hell not!”

        Because Carbonite lost customer data, that’s “why the hell not.”

  • Neil Simpson

    Hi Brent. A relative sent me your column about a Carbonite failure. I’m just a little guy but I loved your counsel to the big guys about carefully watching their in-house backups. I don’t understand all the lingo but I know that you’re exactly right. I sent my relative, my kids and other friends & family this email about the importance of in-house & on-line backups for us little guys as well. Even though we’re small, we none-the-less have some really precious files we need to protect. By the way, as a small user I have been quite satisfied with Carbonite. I have had to recover accidentally deleted files. It was easy, intuitive and quick. My thoughts are in the email==>
    Carl, Thanks for forwarding the Ozar article. I hadn’t heard about this particular episode with Carbonite. I had heard of some failures with other on-line backup services so I know that although it’s rare and usually not disastrous, it can happen. The backup company is immediately aware of any failure and quickly notifies customers as well as taking steps to restore the backup.

    These customers in this article appear to be BIG BOYS with huge databases and complex SQL servers. What Brent Ozar was emphasizing is that they need to review their in-house backup procedures and test them regularly to make sure they are not only adequate but actually working and that is more tricky than many of them think. You read some of the comments. But they are supposed to be professional IT support techs. They’d have to be to understand Ozar’s recommendations. He does look and sound like the ultimate techie nerd doesn’t he? I’ll bet he really knows his stuff. I do think he was expecting too much from companies like Carbonite and that he knew it. That’s why he pushed the IT techs so hard for more and better in-house backup procedures. I loved his statement that in-house backup “failures are a 1-2 punch to the jaw of your career”.

    I’m certainly not an IT professional and my databases are relatively tiny but I do have an in-house automatic backup system. It consists of a 500GB external hard drive and a program called Second Copy. Both are always running on my computer. In Second Copy I listed all directories that I wanted backed up and they and all the subdirectories in their tree were copied to the external HD. Second Copy continuously monitors all these directories and subdirectories on my internal HD and if any file in them is modified or new files are added then the changed or new files are immediately backed up to the external drive. So, at all times I have 2 copies of all of my really precious files right on-site. The probability that both of them will go bad simultaneously is so low that a giant meteorite is more likely to hit and destroy all of my photo files (and all life on earth) although simultaneous HD failure IS more likely than winning a state lottery. I kind of like the backup HD to be external so that if the house must be evacuated I can quickly grab it and take it with me.

    The reason I pay Carbonite $50/yr. to make a 3rd copy by also backing up my files on-line is that there is a small but real possibility that fire, flood, storm, lightning strike, theft, etc. could wipe out both my internal & external HD’s in one fell swoop. In that unlikely event I have the encrypted Carbonite backup far, far off-site and with RAID6 they actually have 3 copies of my files on their servers. Again, I’m more worried about the meteorite than all 5 copies being destroyed at once by anything else.

    Some say I’m paranoid having so many backups but just because you’re paranoid doesn’t mean that someone is not out to get you :<) I have thousands of digital photos and they’re all precious to me. Now, after reading what Ozar had to say in this article I’m totally convinced that my precautions are not overkill at all even if I only had hundreds of photos. – Neil

  • Carbonite x customer

    I emailed carbonite customer support because my entire online backup was corrupt including the windows files.

    Carbonite’s answer was:
    Hello Mr. X removed X and thank you for contacting Carbonite Customer Support.

    Carbonite backs up the data in the same format. Therefore, if you have corrupted file and backed it up, while trying to restore it you will get the same corrupted data prior to backup. We sincerely apologize that this concern is not related to Carbonite.

    Please let us know if you need additional assistance.

    Sincerely,

    Maxwell
    Carbonite Customer Support

    I sincerely believe that my windows vista machine was corrupt when it was running fine until i installed windows 7 and then tried to download the same files. They were all corrupt but it isn’t carbonites fault. I paid for a service that doesn’t do what it promises. 88 gigs worth of information was rendered useless and corrupt as was their service help and support. So much for a paid service

    Frustrated Carbonite customer

  • Carbonite x customer

    My vista machine was not corrupt. I just made a typo above. I installed 7 for testing purposes

  • I doubt that Carbonite has corrupt files — our backups are checked regularly to make sure they are bit-for-bit identical with what’s on your PC. Please email me and I’ll have someone look at your problem.

    Dave Friend, CEO

  • Bobby Handley

    I recently had a hard drive failure. Been a Carbonite customer for 3+ years. Most of my files were restored, but for some reason Carbonite stopped backing up around September of 2008. This means I have lost all pictures, songs, other files I added to my computer since then. When I clicked on the Carbonite Lock icon it said it had just finished backing up. Based upon this, I assumed I was safe.

    I have gone back and forth with Carbonite customer service so many times, I have lost count. The files are just gone. They keep telling me to send them files again and again to no avail.

    Just a word of warning, with Carbonite you may only get some of your files back.

  • [...] from tape – as long as you are thoroughly familiar with how to use it (as Brent Ozar has explained in this great post, please make sure you run through restore drills…so that the real disaster recovery event itself [...]
