Database administrators: are you backing up your SQL Servers to local drives? If so, stop for a minute to consider a few possible side effects, and maybe back up to a network share instead. Why? I’m glad you asked!
Get your data off the server faster
The only reason you back up is to be able to restore. If your server crashes, you need to be able to restore as fast as possible. After a hardware or operating system failure, if your backups are on a local drive of that dead SQL Server, the restore is going to take much longer. You might be running into the datacenter to pull out the hard drives, or spending valuable time trying to resuscitate a dead operating system.
Instead, if the backups are on a network share, you can immediately start restoring them onto another server. The faster you can get back online after disaster strikes, the better you look. Then you can take your time troubleshooting the dead gear.
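In T-SQL terms, the idea looks like this – `SalesDB` and `\\FILESERVER\SQLBackups` are placeholder names for your own database and share, and the SQL Server service account needs write permission on the share:

```sql
-- Back up straight to a UNC path instead of a local drive.
BACKUP DATABASE SalesDB
TO DISK = N'\\FILESERVER\SQLBackups\SalesDB_Full.bak'
WITH CHECKSUM, INIT, STATS = 10;

-- After a crash, any other server with access to the share
-- can start restoring immediately - no drive-pulling required:
RESTORE DATABASE SalesDB
FROM DISK = N'\\FILESERVER\SQLBackups\SalesDB_Full.bak'
WITH RECOVERY;
```

Nothing about the restore depends on the dead server still being reachable, which is the whole point.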
Restore your data to dev/QA faster
Do you frequently refresh your development and testing SQL Servers from production backups? You can do this faster and with less impact to your production servers if those backup files live on a network share instead of on your production server.
Easier log shipping with multiple subscribers
When you have multiple log shipping subscribers pointing to the same publisher, they all need to read the same backups from that central publisher server. The more subscribers you have, the more load you put on the production publisher, because every one of those subscribers is copying data off it.
Save yourself that overhead by simply writing the backup to a network share in the first place. Then the write happens once, it’s off the production publisher, and the publisher can get on with what it needs to do: serving database users.
Faster writes to long-term storage
If you’re writing backups to disk first, and then offloading those backups to a cloud provider or tape, you suffer a performance hit whenever that backup agent connects to grab the backup files from local disk. You’ll get a performance improvement if you write those files off to the network to begin with, because then the backup software can grab the files off the network share – hammering your file server instead of your valuable database server.
Your bottleneck probably isn’t the network card
Sometimes I hear DBAs say they’d rather dump the data onto local disk first to get the backup over and done with faster. That’s sometimes true – but rarely. Many times, especially if you’re using backup compression, the network isn’t the bottleneck.
Want to prove it? Try it yourself: do a backup to a known fast network share, not some slow junky machine in the antique section of the datacenter. Time it to see how long the backup takes compared to writing to local disk. If it takes less time, you should switch to the share right away. Even if it takes 10, 20, or 30% longer, I would still argue that you should switch, because the faster you can get those backups off your server, the safer you are. And I like you – I don’t want you getting fired for not being able to recover your data! I can’t afford to lose any of my blog readers.
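One way to run that experiment – paths and database name are placeholders, and WITH COMPRESSION assumes SQL 2008+ (or a third-party tool like LiteSpeed):

```sql
-- Time the same backup to local disk vs. a network share.
-- Run each, then compare the elapsed time and MB/sec figures
-- SQL Server prints in the messages output.
BACKUP DATABASE SalesDB
TO DISK = N'D:\Backups\SalesDB_local.bak'
WITH COMPRESSION, INIT, STATS = 25;

BACKUP DATABASE SalesDB
TO DISK = N'\\FILESERVER\SQLBackups\SalesDB_network.bak'
WITH COMPRESSION, INIT, STATS = 25;
```

Run the test during a representative load window, not at 3 AM on a quiet server, or the numbers won’t tell you much.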
I respectfully disagree. I think backups should go locally and then to the network or tape. The network is always the slowest hardware subsystem on any enterprise server. A local copy is even more important if you are going to a slllloooooowwww tape.
So yeah, get it off the server, but also keep a recent copy to recover from corruption/user error/etc.
And yer mamma dresses you funny.
Son, you ain’t never respectfully disagreed with nobody. 😉
I’ll give you a few examples of when it’s not the slowest hardware subsystem on an enterprise server:
– Virtual servers with shared luns (where IO can be slower)
– When the backup target storage isn’t fast (like RAID 5 SATA)
– No SAN multipathing, but multiple NICs (not that uncommon since teaming software is free)
Ok on the VMs. Maybe on the teaming, but most configurations I have seen are active/passive.
I am guessing here, but doesn’t a single 5400 RPM hard drive have more throughput than a 1 Gb network connection?
All that said, I know many customers who back up directly to 100 Mb UNC shares due to budgetary constraints. I bet that changes the first time they have to restore. 🙂
It’s surprising, but no – a single 5400 RPM drive can only saturate a network connection in very short bursts. You have to be able to sustain the load for long periods. Then toss in the backup compression you get from LiteSpeed or SQL 2008’s backup compression, and the amount of writes is cut by 70-80%, so you need an army of drives doing the reads in order to saturate a single network connection with the writes.
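Back-of-the-envelope numbers behind that claim – the compression ratio here is an assumption, not a measurement:

```sql
-- A 1 Gbps NIC moves at most ~125 MB/s of backup writes.
-- If compression cuts the written output to ~25% of the data read,
-- the drives must sustain roughly 4x that in reads to keep the NIC busy.
DECLARE @nic_write_MBps decimal(9,2) = 125.0;  -- 1 Gbps / 8 bits per byte
DECLARE @compress_ratio decimal(9,2) = 0.25;   -- assumed 75% reduction
SELECT @nic_write_MBps / @compress_ratio AS read_MBps_to_saturate_nic;
-- ~500 MB/s of sustained reads - far beyond a single 5400 RPM drive,
-- which typically sustains well under 100 MB/s sequentially.
```

So with compression in the mix, the read side runs out of steam long before the gigabit link does.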
I back up to a local drive and also push to a network share that is backed up to tape. It’s easier and quicker to find the backup on the local drive than to find the tape, which might be offsite. Of course, if the server dies, then the network share backup is invaluable.
I backup to a local drive too.
Then I have everything go to TWO file shares. One in the same building, and one offsite on a slow network connection.
I have a scheduled batch file script that calls PKZIP to create the copy for the slow network connection that is chopped into individual 1GB files. This minimizes the disruption caused when a file transfer gets interrupted.
Then Robocopy on each of the respective separate machines pulls the files directly to the network shares.
I have 14 DBs that I back up to a local drive using the SQL 2008 compression feature, which is awesome. I also keep up to 1 month’s worth of backups for each DB, just in case I wanna go back for God knows what reason 😉 After that whole circus process is done, I push everything to a NAS running RAID 1. It’s been working perfectly so far.
While I agree that getting the backup off to a network share is best in the long run, currently the network share setup for our SQL backups takes 12-14 hours to back up the server. Backing up to a local drive only takes about 30 minutes. I would not mind if the network backup took an hour or two. I work for an organization that is stuck in the past: if it works and it’s cheap, then it’s just fine.
Bob – interesting. So do you ever copy the backups off the local disks to somewhere else? Or are you just hoping the motherboard and RAID card never fail?
Firstly, running a database on local disk? It does depend on the database and the underlying architecture… but I’ve just started consulting at a place which does this, and the database is thrashing the local disk and controller badly, causing read and write times to be really slow…
Anyway, off topic 🙂
In terms of what you’re saying, in principle I agree, but how would backing up to the network save you if the server went offline during the backup process?
The only time it would save you hassle is if you ONLY kept backups locally and NEVER backed up to tape, third-party storage, etc.
Also, network packet loss is the devil’s work – it’s a risk to your backups if you’re backing up to network shares. This is more of a risk in pre-2008 days, though (SQL 2008 has the lovely backup checksum option), but it’s worth a mention.
As a wise man once told me: When is your backup considered a success? When you can restore from it.
Good work Brent – love the blog – very thought provoking.
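The checksum option mentioned above looks like this in T-SQL – database and path are placeholders:

```sql
-- WITH CHECKSUM validates page checksums as the backup is written,
-- catching corruption introduced by flaky storage or network transport.
BACKUP DATABASE SalesDB
TO DISK = N'\\FILESERVER\SQLBackups\SalesDB_Full.bak'
WITH CHECKSUM, STATS = 10;

-- Re-verify the finished file without actually restoring it:
RESTORE VERIFYONLY
FROM DISK = N'\\FILESERVER\SQLBackups\SalesDB_Full.bak'
WITH CHECKSUM;
```

VERIFYONLY isn’t the same as a full test restore, but it’s a cheap sanity check that the file on the share is readable and internally consistent.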
@TheDatabaseZealot – if your SQL Server frequently goes offline, I would argue that you need to solve that problem pretty quickly regardless. 😉
It’s not a SQL Server… it’s an Oracle 🙂
And yes… hence me being employed 🙂
I see this is an old post, but am wondering if this config is any better: 1) Back up the databases locally, using a compression tool like LiteSpeed. 2) Back up that locally created file to backup storage. 3) From the backup storage, back up to tape with something like NetBackup.
This is an interesting topic. Wish I saw this earlier.
I have a 60 GB database that is backed up over a network share. During a restore, should I copy it to the local drive and restore, or should I restore directly from the network share? Please let me know which one is more effective.
Ahmed – the only answer is to test on your systems.
Thanks for your reply… this is a new job for me and we have around 30 servers, which is why I’m thinking of a safe route.
Our backups are stored locally but I have SSIS packages that copy the backups off to a network share nightly. On the network share I keep a rolling 3 days of backups. I get an email daily of the backup results and how much free space is left on the network share and the local disk.
Great. That’s perfect because SSIS never fails or misses a job, and even if it does, your alerting system will always let you know within seconds, and you’ll be poised at the keyboard, ready to react and fix the problem. You’ll surely be able to get those files copied off before anything bad happens to the server.
I can’t imagine anything that could go wrong with this whatsoever.
The backups are copied to the network as soon as the local copy has been created. Of course, if the local backups or the SSIS jobs fail I get an email alert. It’s not perfect but it’s better now than it was.
That’s exactly what the next admin is going to say when she replaces your Rube Goldberg setup by simply writing the backup where it needs to go in the first place, reducing complexity and lowering RPO/RTO. 😉
Only saying this because I love ya.
Depending on network only backups introduces many more points of failure and also increases the chance of data corruption. I’ll stick with my local copy and use the network as a backup to my local backups.
Right, because the network only breaks and data only corrupts when SQL Server writes the backup, not when SSIS is copying it, and SSIS is so great at detecting corruption during file copies.
So if the network has a hiccup I still have a local backup. In your scenario you would not.
Jim – again, you’re just not seeing the big picture. A backup on local disk is useless when the machine fails. The goal isn’t to have a useless backup, but a useful one. Just as you’d have retry logic in your SSIS job, you have retry logic in a backup job.
My big picture includes a local and a network backup strategy. What happens when the network backup disks fail? You will not have a backup at all.
Jim – if the only thing that failed was the network backup, you still have the database.
If your entire storage system fails, your point is irrelevant. (You have to have backups in two places anyway.)
If the same disks host both the database and your backups, and that one array fails, you’ve lost everything. In my scenario, there’s no single point of failure.
“if the only thing that failed was the network backup, you still have the database.”
One of the reasons I would be looking for backups would be if the DB is corrupt. So now I have no backups and a corrupt DB. Then I am looking for a new job.
The more backups I have in different places the better I feel. That’s just me, I guess I’m paranoid.
Jim – if you have database corruption, that’s most likely due to corrupt storage. The very same storage where your backups are written to.
So you just foiled your own argument.
Disk issues are a big reason, but not the only reason. If I have a local copy I can just fire it up. If that doesn’t work, I can get the network copy. I like having options. To each his own.
Jim – I’ve completely debunked every reason you’ve had here, but sure, to each their own. If you want to back up to corrupt local storage and then copy that corruption to a network share, sure. I suppose that’s an option. Best of luck on your journey.
Reading some of these replies makes me laugh at how archaic DB admins brains work compared to everyone else. Tell me, do we create “local” disk backups for all our important VMs? Or are we using backup tools like veeam, rubrik, cohesity, zerto, etc that backup directly to external storage?
What the DBAs don’t seem to understand is that with the stupid local backup method you are hurting yourself. You are trashing local disks to store a copy of data right next to production. If a disk fails during your backup to local disk, you are no better off than if the write failed over the network – and the network method only reads data from your production storage, it doesn’t also write to it. It also wastes a bunch of production-quality storage on a backup, and it means you now have backups sitting inside your production data, which will ultimately need to be processed by a real VM backup tool anyway. That tool will most likely already have the ability to create consistent SQL backups via application-aware code – but now it’s also causing a large expansion of the VM backup and increasing backup time.
You’re probably also pushing those local disk backups to a NAS that also needs to be backed up. You’ve got backups of backups going on, and massive storage waste. And ultimately, when a recovery is necessary, you’ll be looking at a VM recovery – if you just let those tools handle it, the recovery would be exponentially quicker, and disaster recovery would be easier too. You know there is more running and configured on that SQL server than just SQL, and it all needs recovery anyway.
At least with writing your SQL backups via SSMS or similar to remote storage directly you fix some of this, but ultimately backups really need to be handled in the backup tools by teams responsible for that instead of these ridiculous methods.
I have come into a situation where there is a SQL cluster with 2 nodes. They back up locally, and copy remotely if they are the primary node for that database. But it becomes complex during a failover, because the second node takes over and starts mirroring/copying its local backups, which are out of step with the original node’s. It uses robocopy with purge to mirror the local backup files.
Backing up directly to the same network location from both nodes would remove this complexity and is likely the way I will go. Do you think this would be the best approach? Anything else we should be thinking about?
I just ran a test backing up 200 GB uncompressed to a newly provisioned NVMe drive, and then compared that to a network share over a 1 Gb NIC to a machine with an identical drive on the same subnet. The local drive took 80 seconds. The network share took 36 minutes.
For reference, the same backup on my 2010 hardware took 21 minutes locally, so back then it was almost a wash.
Every test case is unique, but in 2022 drives are so blazingly fast, I think I’d always want to burn the backup to a local disk and then robocopy it off the machine, both to a network share and to offsite storage. A network hiccup over a 20-60 minute window is not entirely uncommon, and robocopy has fault tolerance and can retry failed transfers. Of course, I suppose a 10 Gb NIC would lop an order of magnitude off that network transfer, but even then I’m not sure that’d change my calculus.
I hear you, but I’ve seen *so many* cases where someone’s artisanally crafted backup copy script failed for days or weeks, and there was no error handling or notification built into that job, and they didn’t have backups available when the box disappeared.
When it comes to backups, don’t get fancy with the Rube Goldberg-esque “the backup goes to disk A, then a Perl script picks it up, it travels over a wire, it drops into a bucket” type stuff. Just keep it simple to maximize the chance of recovery, and minimize the chance of looking like the guy who tried to show off at the least appropriate place.