Production DBA

Improving the Performance of Backups

By Jes Schultz · January 21, 2014 · 22 comments

There will come a time when your database backups will not be as fast as you would like them to be. Over time, the volume of data you’re storing will grow, and backups will take longer – but your weekly or daily maintenance windows don’t grow as well. Backups can be resource-intensive, using I/O and network resources – and if they’re running when users are trying to work, the users may notice the performance impact.

Here are two ways to make your backups go faster.

Option A: Full and Differential backups

A typical backup scenario that I see is a full backup daily and, if the database is in Full recovery model, log backups throughout the day. But what happens when that full backup begins to take hours, instead of minutes? A differential backup may help.

A full backup contains all the data in the database, and enough of the log to recover it. A differential backup contains only the extents (groups of 8K data pages) that have changed since the last full backup. Because a differential is usually smaller than a full backup, it can take less time to complete.

The commands to run the backups are very similar. A full backup:

BACKUP DATABASE AdventureWorks2012
TO DISK=N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full.bak';

1 2	BACKUP DATABASE AdventureWorks2012 TO DISK=N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full.bak';

To perform a differential backup, you add the WITH DIFFERENTIAL option:

BACKUP DATABASE AdventureWorks2012
TO DISK=N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Diff.bak'
WITH DIFFERENTIAL;

BACKUP DATABASE AdventureWorks2012

TO DISK=N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Diff.bak'

WITH DIFFERENTIAL;

Combining full and differential backups throughout the day or week can reduce the time you spend waiting for backups to complete. There are two scenarios I usually see. The first is a full backup one or two days per week, with differential backups on the other days, and regular transaction log backups. This makes sense when a full backup every day would be too time-consuming. The second is a full backup daily, differentials every few hours, and regular log backups. This usually makes sense when fast recovery is necessary, and it would be better to be able to restore a full backup, a differential backup, and a subset of log backups – rather than the full backup and all log backups for the day.

This can add a layer of complexity to a restore strategy – instead of restoring the last full backup and ensuing logs, the most recent differential must be accounted for as well. You should practice this restore sequence so you understand the steps involved. You also need to monitor the size of the differential backups closely – depending on the rate of change in the database, you may not be saving as much space as hoped.

However, when your time to do backups and have them impact your users is shrinking, yet database size is growing, the full + differential strategy can be used effectively.

Option B: Backing up to multiple files

The standard database or log backup is done to one backup file:

BACKUP DATABASE AdventureWorks2012
TO DISK=N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_20140109.bak'
WITH Name = 'AdventureWorks2012 Full One File';

BACKUP DATABASE AdventureWorks2012

TO DISK=N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_20140109.bak'

WITH Name = 'AdventureWorks2012 Full One File';

When a backup is written to one file, one thread is assigned. You can back up to multiple files – in one or more locations – and multiple threads will be used. This can make a backup take less time – sometimes significantly. Here’s the command to create a full backup across four disks:

BACKUP DATABASE AdventureWorks2012
TO DISK=N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_1__20140109.bak',
DISK=N'F:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_2__20140109.bak',
DISK=N'G:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_3__20140109.bak',
DISK=N'H:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_4__20140109.bak'
WITH Name = 'AdventureWorks2012 Full 4 Files';

BACKUP DATABASE AdventureWorks2012

TO DISK=N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_1__20140109.bak',

DISK=N'F:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_2__20140109.bak',

DISK=N'G:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_3__20140109.bak',

DISK=N'H:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_4__20140109.bak'

WITH Name = 'AdventureWorks2012 Full 4 Files';

Having multiple backup files can make a restore more complicated. All of the files must be available, and all must be referenced in a restore session. To restore the backup files shown below, use the following command:

RESTORE DATABASE AdventureWorks2012
FROM DISK=N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_1__20140109.bak',
DISK=N'F:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_2__20140109.bak',
DISK=N'G:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_3__20140109.bak',
DISK=N'H:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_4__20140109.bak'
WITH NORECOVERY;

RESTORE DATABASE AdventureWorks2012
WITH RECOVERY;

RESTORE DATABASE AdventureWorks2012

FROM DISK=N'E:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_1__20140109.bak',

DISK=N'F:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_2__20140109.bak',

DISK=N'G:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_3__20140109.bak',

DISK=N'H:\SQL Backups\AdventureWorks2012\AdventureWorks2012_Full_4__20140109.bak'

WITH NORECOVERY;

RESTORE DATABASE AdventureWorks2012

WITH RECOVERY;

However, the benefits can be significant. I performed five tests – one file on one disk, two files on one disk, four files on one disk, two files on two disks, and four files on four disks. The results show that adding more disks decreases the backup time.

Note: these tests don’t cover all scenarios. If you have a group of disks in a RAID 5 or RAID 10 array, multiple threads writing to multiple files could be faster than one thread and one file. Lesson learned: test your backup files in your environment to determine what is best for you!

Free, 3× a week

Get my new posts by email

Three posts a week, plus a Monday roundup of the best database news from around the web.

22 comments

Jason Carter

January 21, 2014 at 7:07 am

Henk Vandervalk goes really in-depth on this subject as well going as far to demonstrate the utility of the BLOCKSIZE, BUFFERCOUNT and MAXTRANSFERSIZE along with multiple files:

http://henkvandervalk.com/how-to-increase-sql-database-full-backup-speed-using-compression-and-solid-state-disks

These settings really helped me slam a 3TB database to disk for transport to a new machine.

Reply
Colleen Morrow

January 21, 2014 at 1:10 pm

Jason took the words right out of my mouth. BLOCKSIZE and BUFFERCOUNT can make a huge difference in backup times.

Reply
Jeff

January 21, 2014 at 4:20 pm

I’ve used Paul Randal’s script to do targeted differentials, that way if a certain percentage of the database is changed, our backup routines switches and does a full backup.

http://www.sqlskills.com/blogs/paul/new-script-how-much-of-the-database-has-changed-since-the-last-full-backup/

Reply
1. Brent Ozar
  
  January 22, 2014 at 8:06 am
  
  Jeff – nice, that’s the same thing LiteSpeed’s SmartDifferential backups and SQL 2014’s Smart Backups do too.
  
  Reply
Rowland Gosling

January 22, 2014 at 9:45 am

I always liked using multiple files for speed but where it gets tricky is many DBAs have never seen that approach. It complicates a simple, but critical thing and that’s not good. Be sure your emergency documentation is up to date.

Reply
Simon Murin

January 28, 2014 at 5:37 pm

The COMPRESSION also improves backup times – sometimes up to 50%.

Reply
1. Jes Schultz Borland
  
  January 29, 2014 at 8:02 am
  
  Sometimes it can, yes, but it will increase CPU usage. As always, test these things in development first!
  
  Reply
Peter Tran Student since 2017

October 19, 2015 at 10:49 am

Anyone done any empirical testing of whether enabling backup COMPRESSION on a database that has PAGE OR ROW level compression either increases the backup time or the backup size?

From experience, I know trying to gzip a compressed file doesn’t make it smaller.

Reply
1. Brent Ozar
  
  October 19, 2015 at 1:30 pm
  
  Peter – no, it’ll depend on the type of data inside the backup. For example, off-row text data isn’t compressed by page or row compression, so you’d see additional compression there for backup compression.
  
  Reply
Ron

January 25, 2016 at 11:00 am

Hi Jess… What about doing differential backups when you have a multiple backup file set? MS suggests against using multiple bases for a diff backup. Can you comment on your experience with this? Is it really an administrative nightmare?

Reply
1. Brent Ozar
  
  January 25, 2016 at 11:07 am
  
  Ron – let’s take a quick step back. What are you doing differential backups for? How large is the database? Could you accomplish your goals by switching to full backups alone, and tuning those?
  
  Reply
  1. Ron
    
    March 14, 2016 at 12:55 pm
    
    Hi Brent – the diffs are used to populate a warm standby server, using MS SQL Server 2008. I separated the files into multiples to make it easier to handle in our space constrained environment. Database is ~375 Gig with 110 Gig transaction log. We want to avoid running full backups during the day so we don’t negatively affect the performance of our Dynamics AX environment.
    
    Reply
    1. Brent Ozar
      
      March 15, 2016 at 5:45 am
      
      Ron – check out transaction log backups. That way you can keep the standby server even closer up to date, and the business will love that. Start in the Books Online section about log shipping. It’s a fairly common tactic for standby servers, it’s really reliable, and I think you’ll love it.
      
      Reply
      1. Ron
        
        March 15, 2016 at 6:09 am
        
        Thanks Brent. We do transaction log backups every 30 minutes, Diff backups every 4hours. The log shipping option looks interesting, but we didn”t want to degrade server performance and opted to do build the warm standby offline instead.
        
        My question was related to manageability of diff backups using multiple backup files. Can you comment on that?
      2. Brent Ozar
        
        March 15, 2016 at 6:25 am
        
        Log shipping doesn’t degrade server performance. It just uses transaction log backups, and you’re already doing that. If I had to focus on where you should spend your next half-hour learning, I’d strongly recommend that you focus there. Squirrels are cool, but your business would rather lose less than 4 hours of data. Let’s take advantage of those log backups that you’re doing.
Ron

March 15, 2016 at 6:33 am

Thank you Brent. Did you miss the mention of 30 minute transaction log backups? We’re getting no where with my question, no further questions from me. Keep up the good work!

Reply
Jeff Moden

March 31, 2016 at 8:04 pm

Ah, Jes! Thank you for confirming what I’ve been trying to get across to some “providers” for some of the smaller companies I help out. Unless you can split the backups to more than 1 physical spindle, then splitting the backups are always going to take longer because of the resulting head-chatter.

Of course, I’m only 2 years late in finding your good article. Still, “Thank you”.

Reply
Pramod Kumar

November 15, 2018 at 6:37 pm

Hi Brent,
Does having multiple split-backup files affect the restore time for the database?

Reply
1. Brent Ozar
  
  November 17, 2018 at 8:43 am
  
  Pramod – why not run an experiment yourself to find out? That way you’ll have a definitive answer for your own environment.
  
  Reply
Mehdi

August 2, 2019 at 8:55 pm

Hi Brent!
My laptop has one hard disk.
Please see the following tests.
Use Master
GO

Backup Database [AdventureWorksDW2016] To Disk = N’D:\Databases\Backup\AdventureWorksDW_20190801_092912.bak’
/*
Processed 186688 pages for database ‘AdventureWorksDW2016’, file ‘AdventureWorksDW2016_EXT_Data’ on file 1.
Processed 4 pages for database ‘AdventureWorksDW2016’, file ‘AdventureWorksDW2016_EXT_Log’ on file 1.
BACKUP DATABASE successfully processed 186692 pages in 42.211 seconds (34.553 MB/sec).
*/

Backup Database [AdventureWorksDW2016] To
Disk = N’D:\Databases\Backup\AdventureWorksDW_20190801_093010_1.bak’,
Disk = N’D:\Databases\Backup\AdventureWorksDW_20190801_093010_2.bak’,
Disk = N’D:\Databases\Backup\AdventureWorksDW_20190801_093010_3.bak’,
Disk = N’D:\Databases\Backup\AdventureWorksDW_20190801_093010_4.bak’

/*
Processed 186688 pages for database ‘AdventureWorksDW2016’, file ‘AdventureWorksDW2016_EXT_Data’ on file 1.
Processed 2 pages for database ‘AdventureWorksDW2016’, file ‘AdventureWorksDW2016_EXT_Log’ on file 1.
BACKUP DATABASE successfully processed 186690 pages in 45.583 seconds (31.996 MB/sec).
*/

Backup Database [AdventureWorksDW2016] To
Disk = N’D:\Databases\Backup\AdventureWorksDW_20190802_092912.bak’ With Compression
/*
Processed 186688 pages for database ‘AdventureWorksDW2016’, file ‘AdventureWorksDW2016_EXT_Data’ on file 1.
Processed 2 pages for database ‘AdventureWorksDW2016’, file ‘AdventureWorksDW2016_EXT_Log’ on file 1.
BACKUP DATABASE successfully processed 186690 pages in 37.708 seconds (38.679 MB/sec).
*/

Backup Database [AdventureWorksDW2016] To
Disk = N’D:\Databases\Backup\AdventureWorksDW_20190802_093010_1.bak’,
Disk = N’D:\Databases\Backup\AdventureWorksDW_20190802_093010_2.bak’,
Disk = N’D:\Databases\Backup\AdventureWorksDW_20190802_093010_3.bak’,
Disk = N’D:\Databases\Backup\AdventureWorksDW_20190802_093010_4.bak’ With Compression
/*
Processed 186688 pages for database ‘AdventureWorksDW2016’, file ‘AdventureWorksDW2016_EXT_Data’ on file 1.
Processed 2 pages for database ‘AdventureWorksDW2016’, file ‘AdventureWorksDW2016_EXT_Log’ on file 1.
BACKUP DATABASE successfully processed 186690 pages in 33.247 seconds (43.869 MB/sec).
*/

Reply
1. Brent Ozar
  
  August 3, 2019 at 7:53 pm
  
  I have USB thumb drives faster than that. Don’t bother testing on something that you wouldn’t use to host a production database server.
  
  Reply
Jeff Moden

August 3, 2019 at 8:36 am

I think your testing pretty much proves what many of us have said in the past. If you cannot guarantee that each file will be written to a separate physical spindle, you’re just slowing things down and making restores a little more complicated. Having compression enabled does seem to contradict that a bit. I, too, strongly believe in the use of compression for backups but you might want to augment that by looking into the BUFFERCOUNT and MAXTRANSFERSIZE settings of the BACKUP command.

On my production server, I’m currently using a BUFFERCOUNT of 17, a MAXTRANSFERSIZE of 1048576, all single file destinations, and backing up 3.6TB in an average of 2.5 hours. There are a lot of folks out there that do even better but understand that I’m also doing this via NAS (Network Attached Storage) to ensure that I’m my backups are not on the same physical machine that I’m backing up.

Although I don’t remember exactly how long it used to take without those settings, I remember being seriously impressed with the difference those settings made. You’ll need to tweak the settings (like I did) through a bit of experimentation to get the best performance.

Reply

Improving the Performance of Backups

Option A: Full and Differential backups

Option B: Backing up to multiple files

Get my new posts by email

Keep digging

22 comments

Leave a comment Cancel reply