SQL Server Backup Best Practices

I’ve been backing up SQL Servers for almost a decade now, and it’s time to share the lessons I’ve learned. All of this is my own opinion – your mileage may vary – but I’ll try to explain the reasoning behind the choices I make. I won’t address log shipping or snapshots this time around.

Never back up databases to local disk.

If the SQL Server crashes, especially due to a hardware problem or a severe OS problem, the local drives may not be available. In the event of a hardware disaster, I’ve been able to point to my junior guy and say, “Go build a new server and start doing restores from the network share while I try to resuscitate this server.”

Back up databases to a fileshare, then back the share up to tape.

Tape drives these days are fast enough that the vendors like to say DBAs should go straight to tape, and they’re technically right: tape backup & restore speed is not a bottleneck. However, there’s a limited number of drives available. When the DBA needs a restore right away, the tape drives aren’t necessarily sitting idle. At our shop, if multiple people need to do simultaneous restores or backups, there can be a lag time of hours. Disk backups, on the other hand, are always available.

Cost justify the network share with lower licensing costs & simpler backups.

The SAN & backup admins will need cost justification for a new dedicated array just for SQL backups. Tell them that it pays for itself by eliminating the need for backup agents on each SQL Server, plus it simplifies their lives because they can just have one backup policy. Back up everything on that network share once a day, and get it offsite ASAP. They don’t have to worry about what servers are on what schedules or when the peak loads are – they just have one easy task to back up that network share.

Back up to a different SAN if possible.

This one’s not a reality for all shops, but it’s saved my bacon, so I have to mention it. We had our production SAN go down for an entire weekend. I was able to build new database servers using local disk & another SAN, and then restore databases & transaction logs from the network share. The one reason it succeeded was because my backup array was on a different SAN than the production SAN.

My sweet spot for the backup array is raid 10 SATA.

Depending on the backup windows, multiple database servers may be writing backups to that same share simultaneously. Raid 10 gives better write performance than raid 5, and while I’d love to have fiber channel backup drives, it’s cost-prohibitive. Raid 10 on SATA gets me the best balance of cost versus backup throughput. I’ll write a separate article about how to monitor performance metrics during backups to determine where the bottleneck is. Before I started benchmarking, I’d expected my bottleneck to be my gig network cards, but it turned out to be a raid 5 SATA array.

Backup agents like NetBackup and Backup Exec mean giving up scheduling control.

I steer clear of backup agents like Backup Exec, NetBackup and TSM because the schedules are generally dictated by the backup guys instead of the database administrators. I like having the control in the hands of the DBAs because they’re quicker to react to SQL Server problems. For example, if the nightly ETL jobs fail on a data warehouse, the DBAs can quickly pause the backup schedule or restore the databases without input from anybody else. I want my backup guys to concentrate on getting the data to tape and getting it offsite rather than worrying about application problems.

Do regular fire drill rebuilds and restores.

At least once a quarter, do a restore from tape. When the DBA team is separate from the backup administrator, just tell the backup admin that it’s a real restore need, like a developer lost something and needs it back from tape. Restore it.

Build a standalone restore testbed.

Stay buddies with the system administrators. Ask them to keep an eye out for leftover end-of-life servers in the shop that have enough internal drive space to do fire drill restores. I acquired a pretty slow server with a huge number of internal drive bays, picked up a bunch of big drives off Ebay, and presto, I have a restore testbed. One of these days, I’ll automate the fire drill restore process so that each week it restores off the network share, but I haven’t quite gotten that ambitious yet. For now, I do the restores manually.

Keep management informed on restore time estimates.

Every quarter, I look at the backup sizes & speeds, and estimate how long it will take to restore a server from tape. I turn that into a written report to management, and for each server, I give cost estimates to decrease that outage window. Sometimes the answer is third-party backup compression software, sometimes it’s more CPU power for the server, sometimes it’s adding memory, etc. The primary goal is to give them enough information to make a good business decision about whether it’s worth the money. The secondary goal is to CYA: when a server is down, and a restore is taking longer than the business owners would like, then it’s a great time to pull out last quarter’s report and say, “See, I told you this would take an hour to restore, and you said it wasn’t worth the $5,000 to shrink down. Should we reconsider that decision now?”

Trust no one.

Get an external FireWire or USB drive. Once a month, bring it into the office, hook it up to the server hosting the network share, and copy the entire backup share to that drive. It’ll take hours, and that’s OK. Take it home, or in today’s security-conscious world, take it to the bank and put it in a safe deposit box. Nobody should ever need to do this, but sometimes even the best backup administrators make mistakes.

Previous Post
PASS Summit 2007: Day Three Notes
Next Post
SQL Server load testing: synchronized tests versus random ones

80 Comments. Leave new

  • Well written, I really liked the second last point "management report". It made the DBA's presence felt and is a good approach to "slowly" convince management to put more faith in us

    I laughed at the last point though. I don't have enough $$ to buy a big enough drive to copy the backups, and I can't get into data centers either to copy

  • Brent
    in the paragraph above
    My sweet spot for the backup array is raid 10 SATA.
    You said
    I’ll write a separate article about how to monitor performance metrics during backups to determine where the bottleneck is.
    Did you write this somewhere ? I cannot find it
    Please update thanks

  • Forgot to say
    Very nice Article and used it already and cut my backup time from 24 hours to 4 hours (WOW!!!)

  • When you did your benchmarking on the backup process, where did the network overhead ball in the grand scheme of things? Just curious as I am trying to get some metrics around backing up locally vs directly to NAS/SAN…

    cheers and thanks for the great ideas…

  • The post is very helpful for all DBA’s.. Thanks Brent!

  • One more thing about backing up to tape that should be mentioned; tapes are often taken offsite (or should be) incase of a physical data center disaster. While such an event should be rare, there is probably going to be a delay while retrieving the tapes. It’s best to keep an onsite backup on disk.

  • On of the things you mention:

    “Cost justify the network share with lower licensing costs & simpler backups.”

    Are you talking about backups using normal windows share using the CIFS protocol or do you have another technique. I know CIFS backups are not recommended by Microsoft because the physical write to disk is not guaranteed. I have had to rebuild logshipping environments for this very reason

    • Yes, normal Windows shares. Where have you seen that those aren’t recommended by Microsoft? I’d love to see that – it might be a misunderstanding that needs to be cleared up.

  • remote CIFS drives dont constitute what Microsoft would call stable media and as such is occasionally subject to lost write conditions as defined here:


    Ive rexamined the microsoft literature and it actually does say that backing up to network shares can be subject to network errors and backups should be verified.


    I know from experience when setting up logshipping to a remote network share the whole environment can get corrupt when the destination subsytem experiences heavy load.

    • These same guidelines apply to all backup targets. Are you saying you never verify your backups that are written to local disks? I assure you I’ve seen many cases where those have been corrupt as well.

      • By default I normally verify all backups, if I’m doing ad-hoc backups I may not always do this, it depends on size. However by default it seems Microsoft do not configure log shipping this way, hence the corruption, i’ve previously experienced.

        Its good though, revisiting this subject does give me more confidence in backing up to network shares. Im guessing there must be a way to force log shipping to verify its backups.

  • Brent,
    I am one of those rare people it seems like who was a very accomplished Windows AD/Exchange/App Admin and then decided to become a DBA. Above you reference not using backup agents, and I don’t get this. It seems like each DBA wants to do their own style of backups, and as a ex admin I think the standardization is key. Especially since the Oracle, DB2 and Teradata guys do this with no issues at all. It was a pain to setup, but they just restored a 8TB Oracle DB with no issues, so we know it works. Note: we just replaced our whole backup infrastructure, so it’s all new setups right now. SQL is next hopefully.

    It makes it hard to sell SQL Server to management when it has very different backup requirements. Especially since we have non-trusted dev/qa/prod domains that all share the same backup infrastructure. SQL native backups seem to not want to work across untrusted domains to file shares.

    The senior DBA and I get into arguments about this constantly, and I wanted to get your thoughts.


    P.S. I enjoyed the book. I enjoy this blog and read your chapters right away. I took 8 pages of notes that should be very helpful.

    • Hi! Lots of questions in there. I’m heading out on vacation and answering this from my phone, but wanted to jot down a couple of thoughts.

      First, it sounds like you’re saying you can’t sell SQL to management because you have so many choices on how to back it up, whereas the other platforms don’t have any choices. I’d call that an advantage for SQL Server, actually. There’s a ton of ways you can do it, and that flexibility gives SQL DBAs power if they use it wisely.

      SQL backups can work fine over multiple nontristed domains – I’ve got several clients doing this. What specific errors or challenges are you having?

      • Ok, I made that a whole lot more complicated than it needed to be. I really only had one question. First though enjoy your vacation. You can answer this whenever as I was just looking for more experienced DBA’s opinions and your are usually right on.

        The real question from the mess was:

        Why are senior DBA’s so insistent on their native backups versus using the backup plugin of a product like Netbackup, etc.?

        • Ah, gotcha. The reason is that native backups give us much, much, MUCH more flexibility and power. For example, we can do things like log shipping, which let us maintain an disaster recovery server with data as up-to-date as the last 60 seconds – or less. Products like NetBackup can’t touch that.

  • Hi Brent,

    I’m a new SQL Server DBA. I would like to know if there are any GUI tools or scripts available that will make it much easier to monitor backups for SQL Server databases that are hosted on multiple servers?

    I’ve been doing this for three weeks now, so any advice or suggestions would be greatly appreciated.


    • Hi, Jim. I work for Quest Software, and their LiteSpeed for SQL Server does exactly that. It has a central repository that stores the data for all backups across the enterprise in one place. We’ve also got SQL Server Reporting Services reports you can use to let anybody view the backup status too. Here’s the product page:


      I haven’t seen any open source scripts to do the same thing because of the difficulty in gathering backup status across lots of servers. You’d need to build some kind of linked server system or remote querying utility. That gets kind of painful, especially with servers separated by firewalls or in different domains.

      Hope that helps!

  • Brent,

    I greatly appreciate your response.

  • Shane Thompson
    July 28, 2010 7:01 pm

    Hey Brent,

    I actually have a question about backup schedules. Is there any wisdom in staggering your backups so that they aren’t all going on at the same time? Specifically the full backups that are done each night. We have a share that all of our databases write their backups to that is then backed up to tape every night. This is on a SAN with SATA drives in Raid 10… or so I’m told by our SAN admin :-). All of our full backups are scheduled to happen at the same time each night during our off hours. Some of the jobs have been failing reporting back that the network path could no longer be found. I’m fairly certain that it is a network issue or something wrong with the san because when I re-run the jobs that failed they still sometimes fail and this is during a time when the other backups aren’t happening.

    I’ve gotten a little bit off topic here since I wasn’t really looking for an answer to my problem, but it got me thinking about scheduling and I was wondering how you handle your scheduling when you’re dealing with a lot of servers and databases. There’s obviously a limitation to what the drives and the network can do… when is too much too much?

    • Shane – that’s a good question. There definitely is wisdom in staggering them, but to find out if that’s a problem, use Perfmon as I describe in this site. (Search for Perfmon to see the tutorial videos.) Run those counters on the file server hosting the backups, and you’ll find out pretty quickly if your writes are within Microsoft’s guidelines. Hope that helps!

  • I just want to thank you for taking the pain in writing these fine quality posts. You just saved me two days of research!

  • Brent,
    We have been using scripts to back up our databases for several years. This script deletes all backups older than a certain date once the backups are complete.

    Do you use this method, or do you use the RETAINDAYS parameter with your backups? What are the pros and cons of these two methods?

  • have you written the article on “how to monitor performance metrics during backups” as you have mentioned here…

    certainly very useful one.


  • Great post! I’ve been searching around for a backup post like this, especially one that agrees with me. We go straight to tape and yes, just like you said, do not have enough drives to support the madness that could happen if we needed lots of backups all at once.

    Definitely not local disk, but could you get around the server crash problem with a LUN that you could map to another host? Or should I stick with the \\ and try to fix the network bottleneck with more compression?

  • hi,
    i’ve heard that if we take the backup on the following path, that is where the sql server is installed,

    \Microsoft SQL Server\MSSQL.2\MSSQL

    there is a possibility of losing the backup if SQL crashed / corrupted. the folder may not be accessible if SQL crashed / corrupted – is this true?

  • Our 200 server SQL backups of 300TB to network shares, then to tape (as suggested above) is working, however is costly and taxing our infrastructure… We are growing and looking for a better SQL backup solution

    Do any new technologies exist now in 2013 such as NetVault LiteSpeed for SQL , or any other CDP solutions that will cut down on the shear volume of data for Nightly backups?

  • A recent power surge (even going through APC 1500 UPS) has generated some strange things.
    One thing I am unable to figure out is why SQl Server Management Studio started making FULL backups of ALL databases (even inactive ones) to a “temp” folder on the C drive.
    Even when I delete the “temp” folder, it recreates it each night this backup runs.
    The backups in the maintenance plan are still running and being stored on the network.
    This is something new and unusual and I can not find in SQL where this job is located.
    This starting eating up space and I can not find where this job is located.
    It is not in the typical maintenence backup jobs which run and back up to a network drive.
    Any sugestions?
    Everything on the internet says SQL does not back up to a temp folder.
    Perhaps this was done by a prior admin (I have been here 11 months) and was somehow triggered by the event……I have tried contacting the prior admin, but no such luck on an answer.

  • UnBelievably well put together. Incredible stuff. Absolutely incredible! It’s so dead on perfect to what anyone running SQL deals with every day. Obviously that because you guys ARE us. You’re doing the exact same things with real life clients… I am *so* impressed by every single thing your company puts out. I’ve built and run two companies – and it’s CLEAR that you guys have great/strong/confident/[many other great adjectivves] leadership. A company just cannot produce the amount of great stuff you guys do without that kind of management and leadership. I’m not sure how recently you got a simple Thank You, but here’s mine to each of you: Thank You! Kurt.

    • Like Kurt said, you deserve a big THANK YOU not just for the content but also the quality of information provided. More grease to your elbows

      • Bolaji Ogundele
        April 24, 2016 12:47 am

        Good morning Leye, been not just a while, but mighty long a while. I thought about u this morning and got online to search for u. How have u been? This is Bolaji Ogundele from the University of Ilorin days. Enjoy.

  • Jean Tillinghast
    November 19, 2013 2:18 pm

    Great down to earth information. One question for those with a larger amount of experience… I run a full backup of the major databases every night (for us the larger ones are about 25G) to a network drive and send the backup offsite for DR. Up to now we have been saving seven days of full backups on the network drive and just two at our DR site. A full image backup of the server is done weekly. Because of the space issue I was thinking of saving only two on the network drive. Oh, and we only have production, not development, SQL Servers. Is there ever a time that one would be requested to restore a backup greater than two days back?

    The dbs hold our financial, ERP, and other major data. Thanks, jean

    • Jean – that’s actually a question for your business. For example, if someone dropped a table on Friday evening, and nobody noticed until Monday morning, how would that work out?

  • Brent, great article as always! Nicely done. Since this article is 6+ years old now, do you stand by your recommendations? Has anything changed? My new employer backs up right to tape, and the wait types BACKUPBUFFER, BACKUPTHREAD, BACKUPIO, are usually at the top for every SQL Server, so we’re definitely looking into backup up to a network share, then off to tape instead.

    • M – thanks! As you read through the post, what parts do you think have changed and are no longer relevant?

      • I am interested if there are more specific recommendations based on newer SAN technology. For example, for the backup file share, would a simple Windows file server with a gigantic LUN assigned be a good option? (VMware environment on an EMC VNX SAN) I would then use our third party backup tool to go from there to tape.

        • M – the answer depends on your business objectives for RPO and RTO. Building a customized backup plan for your business would be a little beyond the scope of what we could do here in a blog comment, unfortunately.

  • Andrew Notarian
    December 20, 2013 11:31 am

    I have worked places where the unauthorized backup someone had at home saved the company. But its very existence would have gotten the person fired until the major crash and burn happened. We supposed to run BitLocker to go on that bad boy or what?

  • Great article. Someone asked the question if anything has changed. We’re also running a completely Virtual VMWare environment. Everything is running on a SAN running SSD’s and raided to the nth degree. We backup to another drive on the same VM and then copy the backups to a backup server which sits on another SAN.

    One of your reccomendations was to put tempdb on a separate drive. If speed isn’t an issue and given that the data is striped. Does it stil matter as much as it used to?



    • Pete – hmm, I don’t see any recommendations above about putting TempDB on a separate drive, actually. Were you maybe thinking of someone else?

      • Oops. Wrong article. :/

      • Sorry, I had been reading : https://www.brentozar.com/archive/2008/03/sql-server-2005-setup-checklist-part-2-after-the-install/ “Best Practice: Move TempDB to its own drive.”

        SQL server and the evironments we use have come a long way since 2005. Google still finds these articles, and I think there are still some real pearls of wisdom in them. But it’s hard to know if some of these older best practuces are sill valid in todays world.

        • Kendra Little
          October 2, 2014 5:31 pm

          We have some discussion in the comments of that setup post about the tempdb drive issue. Someone asked the same question you have in that article and Brent answered:

          “If TempDB grows suddenly, like if someone creates a giant temp table, how would things work in the different scenarios?”

          I can’t really improve on that answer, so I’ll just quote it. Isolating tempdb isn’t just about IO.

  • Hi Brent,
    I’ve been messing around a little with backups, trying to improve throughput and performance.
    I’m curious as to how adding more RAM can speed up backups? In my testing, going from 4 to 12GB of RAM had a negligible difference.
    Adding more CPU I can understand, and have seen clearly on a live server, but the RAM thing I’m just not able to grasp. Can you elaborate a little (or point me somewhere for more information?)

    I’ve been messing with MAXTRANSFERSIZE and BUFFERCOUNT and again, while I’ve managed to affect the speeds, it hasn’t been a magic bullet as I’d hoped 🙂


    • Andy – in a real-life scenario when you’ve got users hitting the server at the same time the backups are happening, adding memory can reduce the end user workload’s impact on the storage, letting it focus on the backups.

  • Brent,
    You stated in one of your bullet points “I turn that into a written report to management, and for each server, I give cost estimates to decrease that outage window’.

    What is the formula you use for cost estimates?


    • Kenny – I write up the requirements of what work would be required (human, hardware, software) and give that to ’em. It’s not a formula – it’s a project list.

  • Hi Brent,

    Are there any best practice recommendations for SQL Server disk backups i.e. exclude certain file extensions or folders ? Similar to Antivirus exclusion recommendations http://support.microsoft.com/kb/309422

    Most of the SQL Server we build have separate disks (for OS + SQL Binaries, Data, Log, Tempdb, Backup) and I request the INTEL team to only backup OS & Backup drives to tape.

  • My boss is insisting to use Netbackup and I want to use SQL Server native backup, because I know more in SQL Server and not because I know that Netbackup has shortages … he said he won’t listen to me without a whitepaper or a concrete reasons why to leave Netbackup and stay with SQL Server native tools.

    so please list all the WHYs that you can list … and if there is a paper or something that will be much much better.

    • Abu – actually, I’ll turn that around on *you*. How about you make the list, post it here, and then we can collaborate?

      • I’m busy building a list. In our firm we are using the back-up scripts of Ola Hallengren. Beside I also want to be in control (Netbackup Agent will be in control of another team) I have the following pro’s and con’s after two days of testing 🙂 I compare the Netbackup Agent vs SQL backup to local disk which will be picked up by Netbackup file backup afterwards

        Netbackup Agent
        pro 1) restore in 1 step (via the Netbackup tools)
        pro 2) saves local disk space
        con 1) log backup minimum frequency 1 hour
        con 2) restore in consultation with another team
        con 3) if you add a database your first backup will be done while the full backup job runs
        con 4) log job fails on new databases

        SQL Server backup
        pro 1) PITR of max 1 day can be done immediately
        pro 2) log backup minimum frequency 1 minute
        pro 3) new database will be (full) back upped immediately when log backup script runs
        pro 4) you can include verify *)
        con 1) cost local disk space

        I will be working on this the following period. If I discover some new interesting pro’s or cons I will be posting them here
        *) don’t no the possibility of Netbackup yet 🙂

        • Andre – it sounds like this is a great basis for a blog post! Have you thought about starting a blog to share your experience with others? Writing a comment on a years-old post here might not be the best way to get your message out.

          • Hi Brent, you are absolutely right. I was looking for the same list as Abu was looking for and came on my favorite SQL site and thought I should add it. But I have planned to do so (as you probably can see in the url I posted)

  • Brent,

    I currently have scheduled backups to a Network Share, and from there they are both uploaded to the cloud and copied to a portable hard drive. The Network Share itself is also backed up by the SA. I like it this way. I sleep well this way. Our SA wants the SQL backups off the Network Share completely. Instead, he is pushing to management that I could just schedule them to back up directly to the portable hard drive. I don’t like this. It’s just a cheap portable hard drive. I would feel a little better if it was a NAS and fine if it was another SAN. I just have premonitions that backups will fail if the drive has connections or hardware issues. I know those can occur under any setup, but I feel the likelihood will increase. What are your thoughts?



  • Caroline Lavecchia
    July 4, 2017 4:53 pm

    Amazing post Brent Ozar. Tks. Helps me alot.

  • Randy in Marin
    August 9, 2017 7:56 pm

    Hi Brent. I sort of like the idea of Veeam restoring the system, including databases, to the prior night if there is a failure. It’s a lazy way to support a SLA/OLA when last night is good enough. However, I still like the idea of a central share to reduce the VM size.

    I just got out of a meeting where I proposed moving our sql backups from local drives to a single share. The idea was shot down. Besides “If it ain’t broke….”, there were concerns. Would the already stressed backup schedule be adversely affected? Would moving the IO from the private (SAN) network would affect other network traffic? It was also pointed out that if the VM dies, we still have access to the virtual disk drive and can get the backup files. If the SAN dies, then we have the veeam backup – with last night’s files.

    I did not think about skipping the system backup. Guess I was too lazy to think about building 50 SQL Servers to do a full recovery. We could be ready with an image with the same version of SQL. Or perhaps only backup the system when the SQL version changes. I think reducing the load on veeam would be welcome.

    This is an old article. Do you have any updated advice for the new world of hyper-v and veeam?

    • Randy – no, but if you’ve got a specific question I’ll be glad to do what I can. (Just don’t ask “what should my strategy be?” – those are consulting type questions, heh.)

      • Randy in Marin
        August 10, 2017 12:36 pm

        Okay, more specific. Do you recommend
        1. using a separate share for sql backups even if we are virtual?
        2. using separate virtual drives (e.g., data, log, tempdb, etc.) even if they are stored in a single massive array with everything else? (I ask for 7 drives – os, sql system, backup, user data, user log, tempdb data, tempdb log – that seems to earn me a “crazy DBA making work again” look.)
        3. restoring the backups nightly to a test system? (I could request a SQL Server that is the backup fileserver.)
        4. running checkdb on the test system rather than the production system?

  • I have a 60 gb database that is backed over the network share. During restore should i copy to the local drive and restore or I should restore directly from the network share. Please let me know which one is more effective

    • From a time perspective, I’ve never noticed a difference myself. You still have to do the exact same transfer of data. Both methods are equally effective, assuming a completely stable underlying infrastructure.

      Having said that, I far prefer to copy the file locally to the server because it eliminates the network from the restore, especially for larger files.

      There’s nothing more frustrating than getting 95% through a long restore, when the network tweaks and the restore breaks.

    • Ahmed – the only answer is to test on your systems.

  • Bad advice in here about backing up to a network. I wonder how many industry folk just ran with it and now have problems.

    Back up to a different PHYSICAL DISK VOLUME. Not necessarily to the network.

    Backing up to the network can be much slower, you can get contention for traffic, this can blow out your maintenance times and cause knock on effects.

    Back up to a volume that’s a different physical array than your production data. Back up to a volume with redundancy. Back up to somewhere where you’re assured of sensible throughput. Sure, back up off site after that, but don’t knowingly compromise the time it can take to perform regular maintenance.

  • Just wondering if there are any file locations for the SQL Server installation that should be included as part of the backup? Also if the server admin is using a product like Backup Exec, is it best for them to exclude database and log files from the file backups ? I’ve heard that backing up these files directly when the database is in use can cause issues.

  • Jatin kakkar
    March 9, 2022 8:23 am

    I need opinion as we have proposed VEEam backup for our New build servers and SQL servers 2019, wanted to check what would be best backup option s;
    ? do the backups take place with change block tracking on VM side or is agent required on clients (especially on the SQL boxes which backups can often be unique)
    ? Crash Consistent backups vs. application consistent – what is strategy around them since most of these are in use 24×7.
    ? yes likely uses VM CBT for that which is what most due; I would be interested to understand the SQL Backup strategy since 90% of all the servers are SQL based.
    answers to these will be helpful


Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.