SQL Server Tricks: How Can I Restore a Subset of My Tables to A Prior Point in Time?

This question came up when working with a client last week. The team mentioned they were planning to move some tables into a new filegroup, intended to help them quickly restore data when a monthly process goes off the rails and leaves the data in some tables in a bad state.

Over the years I’ve worked on a lot of backup and restore plans with SQL Server, and I learned the hard way that filegroup level restores don’t do exactly what you might think.

[Image: DBZilla Attacks! What if he only eats a FEW tables?]

In this post, we’ll talk through a sample problem. I’ll explain why filegroup level backups don’t fit the bill for this problem, and then talk through a couple of alternatives that may make life easier.

Our Problem: We Need to Restore the Data for Some of the Tables in a Database

Let’s say you manage a database named ImportantDB. Some tables are updated daily in ImportantDB, and other tables are only updated once per month. The tables updated once per month are updated by a legacy application and you don’t have access to the source code.

Periodically, the monthly load will fail due to issues outside of SQL Server. When this occurs, the recommended procedure is to reset the monthly tables to the state they were in prior to the monthly load job beginning, and restart the process. (Essentially, you start over.)

Restoring the entire database takes a long time, and it would also lose any changes made to the “daily” tables since the restore point.

Attempt 1: Filegroup Level Backups

In SQL Server we have the ability to back up and restore individual filegroups. Enterprise Edition goes further: it can keep the primary filegroup and select other filegroups online and responding to queries while you restore individual filegroups.

For situations like this, it seems like filegroup level backups and restores will be the answer to our problem. The plan we make looks like this (a T-SQL sketch follows the lists):

  • Move the monthly tables into a filegroup named MONTHLYTABLES
  • Set the filegroup as READ_ONLY

Whenever we need to load data into the filegroup, we plan to:

  • Take a filegroup level backup of MONTHLYTABLES
  • Set the MONTHLYTABLES filegroup as READ_WRITE
  • Start the process to change data
  • If there are any issues with the data load, restore over the filegroup from the READ_ONLY backup
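
Sketched in T-SQL, the plan might look roughly like this. The file paths and names are hypothetical, and moving existing tables onto the new filegroup requires rebuilding their clustered indexes there (not shown):

    -- One-time setup: add the filegroup and a file, then make it read-only
    ALTER DATABASE ImportantDB ADD FILEGROUP MONTHLYTABLES;
    ALTER DATABASE ImportantDB ADD FILE
        (NAME = N'MonthlyData', FILENAME = N'S:\Data\ImportantDB_Monthly.ndf')
        TO FILEGROUP MONTHLYTABLES;
    -- (Rebuild the monthly tables' clustered indexes on MONTHLYTABLES to move them.)
    ALTER DATABASE ImportantDB MODIFY FILEGROUP MONTHLYTABLES READ_ONLY;

    -- Before each monthly load
    BACKUP DATABASE ImportantDB
        FILEGROUP = N'MONTHLYTABLES'
        TO DISK = N'S:\Backup\ImportantDB_Monthly.bak';
    ALTER DATABASE ImportantDB MODIFY FILEGROUP MONTHLYTABLES READ_WRITE;
    -- ...run the load; on failure, try to restore the filegroup backup...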

Why Filegroup Level Restores Don’t Solve This Problem

The restore sequence described above won’t work in SQL Server, by design. SQL Server’s mission with filegroup level restores is described in Books Online:

Each filegroup-restore sequence restores and recovers one or more offline filegroups to a point consistent with the database.

As soon as the MONTHLYTABLES filegroup is made READ_WRITE, we can no longer restore over it with a backup made from a prior point in time and bring it online. We can perform the filegroup level RESTORE if we’re in the appropriate recovery model, but in order to bring the filegroup online we also have to restore additional log backups to bring it to a point in time that is consistent with the rest of the database. (Don’t believe me? Check out Tibor Karaszi’s fine blog on the topic, which includes sample code. Tibor’s examples are for the full recovery model; play around with things and you’ll see that you can’t make it work in the simple recovery model, either.)
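
If you want to see the behavior for yourself, the sequence looks roughly like this (full recovery model assumed; paths are hypothetical). The filegroup restore itself succeeds, but SQL Server won’t bring the filegroup online until it has been rolled forward to match the rest of the database:

    -- Restore the filegroup from the backup taken while it was READ_ONLY
    RESTORE DATABASE ImportantDB
        FILEGROUP = N'MONTHLYTABLES'
        FROM DISK = N'S:\Backup\ImportantDB_Monthly.bak'
        WITH NORECOVERY;

    -- To bring the filegroup online, SQL Server requires the log chain,
    -- rolling the filegroup FORWARD and reapplying the very changes we
    -- were trying to undo (repeat for each log backup, including the tail)
    RESTORE LOG ImportantDB
        FROM DISK = N'S:\Backup\ImportantDB_Log.trn'
        WITH RECOVERY;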

Why does it have to be this way? It’s because SQL Server doesn’t track transaction status by filegroups, and its mission is to ensure transactional consistency within a database. If we could restore individual filegroups to different points in time and bring everything online, we wouldn’t have a way to ensure consistency.

Alternative 1: Database Snapshots

If we’re running SQL Server Enterprise Edition, we can automate a solution to our process by using a database snapshot. Our monthly load procedure would look like this (a T-SQL sketch follows the list):

  • Create a database snapshot
  • Start the process to change data
  • If there are any issues with the data load, either:
    • Revert to the snapshot (this will impact the whole database)
    • Truncate and reload the monthly tables using the snapshot as a data source
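
A rough T-SQL sketch. The snapshot name, file paths, and table names are hypothetical, and the CREATE DATABASE statement must list every data file in ImportantDB by its logical name:

    -- Before the load: snapshot the current state (Enterprise Edition)
    CREATE DATABASE ImportantDB_PreLoad ON
        (NAME = N'ImportantDB', FILENAME = N'S:\Data\ImportantDB_PreLoad.ss')
        AS SNAPSHOT OF ImportantDB;

    -- Option A: revert the WHOLE database to the snapshot
    RESTORE DATABASE ImportantDB
        FROM DATABASE_SNAPSHOT = N'ImportantDB_PreLoad';

    -- Option B: reload just the monthly tables from the snapshot
    TRUNCATE TABLE ImportantDB.dbo.MonthlyTable;
    INSERT INTO ImportantDB.dbo.MonthlyTable
        SELECT * FROM ImportantDB_PreLoad.dbo.MonthlyTable;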

This solution will work, but depending on the size of the monthly tables, it may have some performance problems. A database snapshot is copy-on-write: before any page in the monthly tables is modified, the original page is written to the snapshot’s sparse file. If the monthly tables are very large, we end up paying significant write IO against the snapshot file as well as against the database we are repairing.

If it’s OK to lose any other data that has changed since the snapshot was taken, the revert option may work for us, provided that we don’t mind dropping any full text catalogs and rebuilding the transaction log.

The only time I really like a database snapshot option is when I’m 100% sure that other processes have not been updating data. That just isn’t the case for a lot of databases, so we may need to seek other options.

Alternative 2: Views

We could also solve this problem by moving the monthly tables into a new database named ImportantMonthly. Then, we would replace the original tables in ImportantDB with views. A simple view of “SELECT [columnnames] from ImportantMonthly.schema.table” would be put in place for each table.
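
Each view is a thin pass-through to the new database. A sketch, with hypothetical table and column names:

    CREATE VIEW dbo.MonthlyTable
    AS
    SELECT Col1, Col2, Col3
    FROM ImportantMonthly.dbo.MonthlyTable;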

[Image: ImportantDB is safe... and possibly has a friend.]

Because each view selects directly from a single base table, the views remain updatable: they function just like tables and allow insert, update, delete, and select statements to be run against the base tables.

This would allow the process to become:

  • Take a full database backup of ImportantMonthly
  • Start the process to change data
  • If there are any issues with the data load, restore over the ImportantMonthly database

The main gotcha to be aware of when using views in this manner is that if the definition of the tables in ImportantMonthly changes, you may need to refresh the metadata for the views with the sp_refreshview procedure.
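
For example, after altering a table in ImportantMonthly (view name hypothetical):

    EXEC sp_refreshview N'dbo.MonthlyTable';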

Alternative 3: Synonyms

Similar to the view solution, we could move the tables into ImportantMonthly, and then replace the original tables in ImportantDB with synonyms.

Synonyms in SQL Server don’t have the same metadata issue as views, but there is one important thing to know: you cannot reference a synonym that is on the other side of a linked server.
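
Creating the synonym is a one-liner (names hypothetical):

    CREATE SYNONYM dbo.MonthlyTable
    FOR ImportantMonthly.dbo.MonthlyTable;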

The Devil is in the Details

This example highlights one thing to me: when you try to solve a problem with just one kind of tool, you have to make compromises. To build a strong, reliable solution and mitigate the risks, database administrators need the collaboration of whoever controls the application code for the database, whether that’s a vendor or an in-house development team.


7 Comments

  • There are some great 3rd party tools like Idera’s SQLSafe that allow you to restore to any point-in-time – depending on your backup/retention policy of course. As long as you have a continuous chain of backups, you are good to go. I use point-in-time restores all the time and while I was skeptical to use a 3rd party tool (used LiteSpeed in the past), I love Idera’s SQLSafe.

    Point being, you could restore the entire database to a point-in-time and peel out the tables you need, or then use some of your suggestions to tie into them and get what you need.

  • Subhash Pant
    May 17, 2012 1:10 pm

    That is one of the myths that I had in mind too although I had never tried it. Thanks for this post! 🙂

  • Great post! I have tried alternative 2, but never thought of the other alternatives. Thanks, Kendra, for sharing this.

  • Adrian Facio
    May 21, 2012 11:44 am

    I like alternatives 2 and 3; they provide decoupling, from my point of view.

  • One day Microsoft will invent flashback =)

  • Hi Kendra! I really liked this blog post since I’m approached with this situation somewhat frequently by developers, but in my case it’s because someone forgot to put a WHERE clause on their DELETE statement and things like that. What I have done in the past is restore the last good backup to a development server and then transfer the needed tables from dev to production. What would be the negative side effects of this? I’m assuming you didn’t include this as a solution in your blog post for a specific reason, just curious as to why.

    Also, thanks for all you do for the SQL community. You and Brent and Jeremiah (and now Jes) have been so helpful to me in my first DBA job. There are so many good SQL blogs, but this blog is my favorite because of the breadth of topics and the webcasts. Keep up the awesome work!

  • Depending on what data sets you are populating those tables with, you could partition your tables and, in case of a failure, just dump the data within the new partition. Data could be partitioned on a going-forward basis, i.e. sliced by month or by day. Create load tables which are on the same partition scheme/function as your ImportantMonthly tables. Then:

    Truncate the load tables
    Load the data on the partition key “ImportDate” (or relevant field) into the load tables
    Check the process
    If it finished correctly, switch partitions from the load tables into ImportantMonthly (the switch operation just changes a data pointer, so it runs in milliseconds)
    If it failed to finish, restart the load process (the logic already handles existing partition data)
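
    A minimal sketch of the switch step described above (table names and partition numbers are hypothetical; both tables must share the same partition scheme, indexes, and constraints, and the target partition must be empty):

        -- Move this month's data from the load table into the live table
        ALTER TABLE dbo.LoadTable
            SWITCH PARTITION 5 TO dbo.ImportantMonthlyTable PARTITION 5;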

