Improving the Performance of Backups

There will come a time when your database backups are not as fast as you would like them to be. Over time, the volume of data you’re storing grows and backups take longer – but your weekly or daily maintenance windows don’t grow along with it. Backups can be resource-intensive, consuming I/O and network resources – and if they’re running while users are trying to work, those users may notice the performance impact.

Here are two ways to make your backups go faster.

Option A: Full and Differential backups

A typical backup scenario that I see is a full backup daily and, if the database is in Full recovery model, log backups throughout the day. But what happens when that full backup begins to take hours, instead of minutes? A differential backup may help.

A full backup contains all the data in the database, and enough of the transaction log to recover it. A differential backup contains only the extents (groups of eight 8-KB data pages) that have changed since the last full backup. Because a differential is usually smaller than a full backup, it can take less time to complete.

The commands to run the backups are very similar. A full backup:
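
Here’s a minimal sketch; the database name and backup path are placeholders for your own:

BACKUP DATABASE [SalesDB]
TO DISK = N'D:\Backups\SalesDB_Full.bak';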

To perform a differential backup, you add the WITH DIFFERENTIAL option:
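
For example, continuing with the same placeholder names:

BACKUP DATABASE [SalesDB]
TO DISK = N'D:\Backups\SalesDB_Diff.bak'
WITH DIFFERENTIAL;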

Combining full and differential backups throughout the day or week can reduce the time you spend waiting for backups to complete. There are two scenarios I usually see. The first is a full backup one or two days per week, with differential backups on the other days, and regular transaction log backups. This makes sense when a full backup every day would be too time-consuming. The second is a full backup daily, differentials every few hours, and regular log backups. This usually makes sense when fast recovery is necessary, and it would be better to be able to restore a full backup, a differential backup, and a subset of log backups – rather than the full backup and all log backups for the day.

This can add a layer of complexity to a restore strategy – instead of restoring the last full backup and ensuing logs, the most recent differential must be accounted for as well. You should practice this restore sequence so you understand the steps involved. You also need to monitor the size of the differential backups closely – depending on the rate of change in the database, you may not be saving as much space as hoped.
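
For example, a restore that includes a differential might look something like this sketch (the database name and file names are hypothetical):

-- 1. Restore the most recent full backup, but don't recover yet
RESTORE DATABASE [SalesDB] FROM DISK = N'D:\Backups\SalesDB_Full.bak' WITH NORECOVERY;
-- 2. Restore the most recent differential taken after that full
RESTORE DATABASE [SalesDB] FROM DISK = N'D:\Backups\SalesDB_Diff.bak' WITH NORECOVERY;
-- 3. Restore each log backup taken after the differential, in order, recovering on the last one
RESTORE LOG [SalesDB] FROM DISK = N'D:\Backups\SalesDB_Log_1.trn' WITH NORECOVERY;
RESTORE LOG [SalesDB] FROM DISK = N'D:\Backups\SalesDB_Log_2.trn' WITH RECOVERY;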

However, when the time you have to run backups without impacting your users is shrinking, yet database size is growing, the full + differential strategy can be used effectively.

Option B: Backing up to multiple files

The standard database or log backup is done to one backup file:
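
For instance, a sketch with placeholder names:

BACKUP DATABASE [SalesDB]
TO DISK = N'D:\Backups\SalesDB_Full.bak';

BACKUP LOG [SalesDB]
TO DISK = N'D:\Backups\SalesDB_Log.trn';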

When a backup is written to one file, one thread is assigned. You can back up to multiple files – in one or more locations – and multiple threads will be used. This can make a backup take less time – sometimes significantly. Here’s the command to create a full backup across four disks:
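
A sketch, assuming four separate drives mounted as E:, F:, G:, and H: (the database name and paths are placeholders):

BACKUP DATABASE [SalesDB]
TO DISK = N'E:\Backups\SalesDB_1.bak',
   DISK = N'F:\Backups\SalesDB_2.bak',
   DISK = N'G:\Backups\SalesDB_3.bak',
   DISK = N'H:\Backups\SalesDB_4.bak';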

Having multiple backup files can make a restore more complicated. All of the files must be available, and all must be referenced in the restore command. To restore the backup files created above, use the following command:
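
Continuing with the same hypothetical file names:

RESTORE DATABASE [SalesDB]
FROM DISK = N'E:\Backups\SalesDB_1.bak',
     DISK = N'F:\Backups\SalesDB_2.bak',
     DISK = N'G:\Backups\SalesDB_3.bak',
     DISK = N'H:\Backups\SalesDB_4.bak';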

However, the benefits can be significant. I performed five tests – one file on one disk, two files on one disk, four files on one disk, two files on two disks, and four files on four disks. The results show that adding more disks decreases the backup time.

[Chart: backup times for the five tests, from one file on one disk through four files on four disks]

Note: these tests don’t cover all scenarios. If you have a group of disks in a RAID 5 or RAID 10 array, multiple threads writing to multiple files could be faster than one thread writing to one file. Lesson learned: test backing up to multiple files in your environment to determine what is best for you!


Comments

  • Henk Vandervalk goes really in-depth on this subject as well, going as far as to demonstrate the utility of BLOCKSIZE, BUFFERCOUNT, and MAXTRANSFERSIZE along with multiple files:

    http://henkvandervalk.com/how-to-increase-sql-database-full-backup-speed-using-compression-and-solid-state-disks

    These settings really helped me slam a 3TB database to disk for transport to a new machine.

  • Jason took the words right out of my mouth. BLOCKSIZE and BUFFERCOUNT can make a huge difference in backup times.

  • I’ve used Paul Randal’s script to do targeted differentials; that way, if a certain percentage of the database has changed, our backup routine switches and does a full backup.

    http://www.sqlskills.com/blogs/paul/new-script-how-much-of-the-database-has-changed-since-the-last-full-backup/

  • Rowland Gosling
    January 22, 2014 9:45 am

    I always liked using multiple files for speed, but where it gets tricky is that many DBAs have never seen that approach. It complicates a simple but critical thing, and that’s not good. Be sure your emergency documentation is up to date.

  • The COMPRESSION option also improves backup times – sometimes by up to 50%.

    • Jes Schultz Borland
      January 29, 2014 8:02 am

      Sometimes it can, yes, but it will increase CPU usage. As always, test these things in development first!

  • Anyone done any empirical testing of whether enabling backup COMPRESSION on a database that has PAGE or ROW level compression either increases the backup time or the backup size?

    From experience, I know trying to gzip a compressed file doesn’t make it smaller.

    • Peter – no, it’ll depend on the type of data inside the backup. For example, off-row text data isn’t compressed by page or row compression, so you’d see additional compression there for backup compression.

  • Hi Jes… What about doing differential backups when you have a multiple backup file set? MS suggests against using multiple bases for a diff backup. Can you comment on your experience with this? Is it really an administrative nightmare?

    • Ron – let’s take a quick step back. What are you doing differential backups for? How large is the database? Could you accomplish your goals by switching to full backups alone, and tuning those?

      • Hi Brent – the diffs are used to populate a warm standby server, using MS SQL Server 2008. I separated the files into multiples to make it easier to handle in our space constrained environment. Database is ~375 Gig with 110 Gig transaction log. We want to avoid running full backups during the day so we don’t negatively affect the performance of our Dynamics AX environment.

        • Ron – check out transaction log backups. That way you can keep the standby server even closer up to date, and the business will love that. Start in the Books Online section about log shipping. It’s a fairly common tactic for standby servers, it’s really reliable, and I think you’ll love it.

            • Thanks Brent. We do transaction log backups every 30 minutes, diff backups every 4 hours. The log shipping option looks interesting, but we didn’t want to degrade server performance and opted to build the warm standby offline instead.

            My question was related to manageability of diff backups using multiple backup files. Can you comment on that?

          • Log shipping doesn’t degrade server performance. It just uses transaction log backups, and you’re already doing that. If I had to focus on where you should spend your next half-hour learning, I’d strongly recommend that you focus there. Squirrels are cool, but your business would rather lose less than 4 hours of data. Let’s take advantage of those log backups that you’re doing.

  • Thank you Brent. Did you miss the mention of 30-minute transaction log backups? We’re getting nowhere with my question; no further questions from me. Keep up the good work!

  • Ah, Jes! Thank you for confirming what I’ve been trying to get across to some “providers” for some of the smaller companies I help out. Unless you can split the backups across more than 1 physical spindle, splitting the backups is always going to take longer because of the resulting head-chatter.

    Of course, I’m only 2 years late in finding your good article. Still, “Thank you”.

  • Hi Brent,
    Does having multiple split-backup files affect the restore time for the database?

  • Hi Brent!
    My laptop has one hard disk.
    Please see the following tests.
    Use Master
    GO

    Backup Database [AdventureWorksDW2016] To Disk = N'D:\Databases\Backup\AdventureWorksDW_20190801_092912.bak'
    /*
    Processed 186688 pages for database 'AdventureWorksDW2016', file 'AdventureWorksDW2016_EXT_Data' on file 1.
    Processed 4 pages for database 'AdventureWorksDW2016', file 'AdventureWorksDW2016_EXT_Log' on file 1.
    BACKUP DATABASE successfully processed 186692 pages in 42.211 seconds (34.553 MB/sec).
    */

    Backup Database [AdventureWorksDW2016] To
    Disk = N'D:\Databases\Backup\AdventureWorksDW_20190801_093010_1.bak',
    Disk = N'D:\Databases\Backup\AdventureWorksDW_20190801_093010_2.bak',
    Disk = N'D:\Databases\Backup\AdventureWorksDW_20190801_093010_3.bak',
    Disk = N'D:\Databases\Backup\AdventureWorksDW_20190801_093010_4.bak'

    /*
    Processed 186688 pages for database 'AdventureWorksDW2016', file 'AdventureWorksDW2016_EXT_Data' on file 1.
    Processed 2 pages for database 'AdventureWorksDW2016', file 'AdventureWorksDW2016_EXT_Log' on file 1.
    BACKUP DATABASE successfully processed 186690 pages in 45.583 seconds (31.996 MB/sec).
    */

    Backup Database [AdventureWorksDW2016] To
    Disk = N'D:\Databases\Backup\AdventureWorksDW_20190802_092912.bak' With Compression
    /*
    Processed 186688 pages for database 'AdventureWorksDW2016', file 'AdventureWorksDW2016_EXT_Data' on file 1.
    Processed 2 pages for database 'AdventureWorksDW2016', file 'AdventureWorksDW2016_EXT_Log' on file 1.
    BACKUP DATABASE successfully processed 186690 pages in 37.708 seconds (38.679 MB/sec).
    */

    Backup Database [AdventureWorksDW2016] To
    Disk = N'D:\Databases\Backup\AdventureWorksDW_20190802_093010_1.bak',
    Disk = N'D:\Databases\Backup\AdventureWorksDW_20190802_093010_2.bak',
    Disk = N'D:\Databases\Backup\AdventureWorksDW_20190802_093010_3.bak',
    Disk = N'D:\Databases\Backup\AdventureWorksDW_20190802_093010_4.bak' With Compression
    /*
    Processed 186688 pages for database 'AdventureWorksDW2016', file 'AdventureWorksDW2016_EXT_Data' on file 1.
    Processed 2 pages for database 'AdventureWorksDW2016', file 'AdventureWorksDW2016_EXT_Log' on file 1.
    BACKUP DATABASE successfully processed 186690 pages in 33.247 seconds (43.869 MB/sec).
    */

  • I think your testing pretty much proves what many of us have said in the past. If you cannot guarantee that each file will be written to a separate physical spindle, you’re just slowing things down and making restores a little more complicated. Having compression enabled does seem to contradict that a bit. I, too, strongly believe in the use of compression for backups but you might want to augment that by looking into the BUFFERCOUNT and MAXTRANSFERSIZE settings of the BACKUP command.

    On my production server, I’m currently using a BUFFERCOUNT of 17, a MAXTRANSFERSIZE of 1048576, all single file destinations, and backing up 3.6TB in an average of 2.5 hours. There are a lot of folks out there who do even better, but understand that I’m also doing this via NAS (Network Attached Storage) to ensure that my backups are not on the same physical machine that I’m backing up.

    Although I don’t remember exactly how long it used to take without those settings, I remember being seriously impressed with the difference those settings made. You’ll need to tweak the settings (like I did) through a bit of experimentation to get the best performance.
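
    As a rough sketch of how those options might fit together in one command (the database name and UNC path are placeholders, and the numbers are simply the values mentioned above, not general recommendations):

    -- Hypothetical example: compressed, single-file backup to a NAS share with tuned buffers
    BACKUP DATABASE [YourDatabase]
    TO DISK = N'\\NAS01\SQLBackups\YourDatabase_Full.bak'
    WITH COMPRESSION, BUFFERCOUNT = 17, MAXTRANSFERSIZE = 1048576;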

