SQL Server Backup Software: Part 1 – Why Native SQL Backups Suck

Before we start evaluating SQL Server backup compression software, we need to spend a few minutes on the weaknesses of the native SQL Server backup process. To judge the fixes, we have to know what's broken.

Native SQL Server backups take the same disk space as the data.

When we back up 100GB of data with a native backup, we end up with a 100GB backup file. If a database has 100GB allocated but is only half full (an oversized log file, for example), the backup will be roughly 50GB, the size of the actual data.
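You can see this for yourself with a quick sketch like the one below. The database name and backup path are placeholders for whatever exists in your environment; the last query just pulls the size of the most recent full backup out of msdb's history tables.

-- Sketch only: back up a database, then compare its used data size
-- to the size of the backup file. Names and paths are placeholders.
BACKUP DATABASE SalesDB
TO DISK = N'D:\Backups\SalesDB_Full.bak'
WITH INIT, STATS = 10;          -- STATS prints progress so you can watch throughput

-- Used space inside the database
EXEC SalesDB.dbo.sp_spaceused;

-- Size of the backup we just took, from msdb's backup history
SELECT TOP (1) backup_size / 1048576.0 AS backup_size_mb
FROM   msdb.dbo.backupset
WHERE  database_name = N'SalesDB'
  AND  type = 'D'
ORDER BY backup_finish_date DESC;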

Large amounts of data take a long time to write to disk.

The slowest thing in the backup process is usually writing the backup file, whether it’s over the network or to local disk. Reads are typically faster than writes, so unless the database is under heavy transactional load at the time of the backup, the reads won’t be the bottleneck. As a result, the more data that has to get written to disk, the longer the backup will take.

We could alleviate that by buying faster and faster arrays for our backup targets, but that gets expensive in a hurry. Our managers start asking why the DBA's fastest RAID array is being used for backups instead of the live data!

Large amounts of data take a REALLY long time to push over a network to a DR site.

This affects log shipping, or even just plain copying backup files over the WAN. Compressing the data by as little as 25% cuts transmission times by that same amount and cuts the bandwidth required to replicate the application data. In a large enterprise where multiple applications are competing for the same WAN pipe, other teams will ask why the SQL DBA can't compress their data before sending it over the wire.

We can work around that problem by installing WAN optimization hardware like a Cisco WAAS appliance, but these have their own drawbacks. They must be installed on both ends of the network (the primary datacenter and the DR site), require a lot of management overhead, and they’re expensive. Really expensive.

Another workaround is to compress the backup files with something like WinZip after the backup has finished, but that's an extra process the DBA has to script and actively manage, and it adds lag time for the compression before the data can be sent offsite.

SQL Management Studio doesn’t come with reports about the backup process.

Business folks like to say, “You get what you measure.” The idea is that if you start numerically measuring something in an objective way, that number will start to improve simply because you’re focusing on it and talking about it with others. SQL Server native backups are something of a black box: there’s no quick report to show how long backups are taking per database, how often they’re failing, and how long it would take to do a full restore in the event of an emergency.

I find it hilariously ironic that my job as a database administrator revolves around storing precise metrics for others, enabling them to do dashboards and reports, but SQL’s native backup system doesn’t offer any kind of dashboard or report to show its own backup & restore times and successes. SQL 2005 SP2 started to offer some database performance reports inside of SQL Server Management Studio, but they still don’t address the backup/restore metrics.

An ambitious DBA could build their own reports, but they have to manually consolidate data from all of their database servers across the enterprise and keep it in sync. Whew – I get tired just thinking about that. (I should probably subtitle my blog as “The Lazy DBA” come to think of it.)
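For what it's worth, here's a rough sketch of the kind of query that ambitious DBA would have to run against msdb on every single instance and then consolidate somewhere central. The 30-day window is just an example.

-- Rough sketch: backup history per database from msdb on a single instance.
-- You still have to run this on every server and consolidate the results yourself.
SELECT  bs.database_name,
        bs.type AS backup_type,            -- D = full, I = differential, L = log
        COUNT(*) AS backup_count,
        AVG(DATEDIFF(SECOND, bs.backup_start_date, bs.backup_finish_date)) AS avg_duration_seconds,
        AVG(bs.backup_size) / 1048576.0 AS avg_size_mb,
        MAX(bs.backup_finish_date) AS last_backup_finish
FROM    msdb.dbo.backupset AS bs
WHERE   bs.backup_finish_date >= DATEADD(DAY, -30, GETDATE())
GROUP BY bs.database_name, bs.type
ORDER BY bs.database_name, bs.type;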

Cross-server restores are a ton of manual work.

If the DBA wants to bring a development server up to the most current production backup, including transaction logs, they either have to write a complicated script to parse through a list of available backups, or they have to do a lot of manual restores by picking files.
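To give you a feel for the scripting involved, here's a bare-bones sketch that finds the latest full backup and the log backups taken after it for one database (the database name is a placeholder). It only lists the files; building the actual RESTORE statements, the MOVE clauses, and the NORECOVERY handling is still on the DBA.

-- Bare-bones sketch: find the latest full backup and the log backups taken
-- after it for a single database. Generating the RESTORE commands is left to you.
DECLARE @DatabaseName sysname = N'SalesDB';   -- placeholder database name

-- Most recent full backup
SELECT TOP (1) bmf.physical_device_name, bs.backup_finish_date
FROM   msdb.dbo.backupset AS bs
JOIN   msdb.dbo.backupmediafamily AS bmf ON bmf.media_set_id = bs.media_set_id
WHERE  bs.database_name = @DatabaseName
  AND  bs.type = 'D'
ORDER BY bs.backup_finish_date DESC;

-- Log backups taken since that full backup
SELECT bmf.physical_device_name, bs.backup_finish_date
FROM   msdb.dbo.backupset AS bs
JOIN   msdb.dbo.backupmediafamily AS bmf ON bmf.media_set_id = bs.media_set_id
WHERE  bs.database_name = @DatabaseName
  AND  bs.type = 'L'
  AND  bs.backup_finish_date > (SELECT MAX(backup_finish_date)
                                FROM msdb.dbo.backupset
                                WHERE database_name = @DatabaseName AND type = 'D')
ORDER BY bs.backup_finish_date;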

Even worse, in the event of a disaster, the database administrator has to scramble through directories looking for t-logs, writing restore scripts, and hoping they work. Really good DBAs plan this scenario out and test it often, but let's be honest: most of us don't have that much time. We write our DRP scripts once, test them when we have to, and cross our fingers the rest of the time.

That frustrates me because the restore process has been the same since I started doing database administration back in 1999 as a network admin. For years, I looked for the most reliable restore scripts I could find, played with them, spent time tweaking them, and wasted a lot of time. In my mind, this is something that should be completely integrated with the shipping version of SQL Server just because it’s so central to a DBA’s job.

Enough Backup Problems: Show Me Solutions!

So now we’ve seen some of the weaknesses in SQL Server 2005’s native backups. In my next couple of blog posts, I’ll talk about how third-party backup compression software gets around these obstacles and adds capabilities the native process lacks.

Continue Reading with Part 2: Quad Cores are Changing the Game
