What happens to in-flight data compression in an emergency?

SQL Server
10 Comments

Data compression can have many uses and advantages, but it also has its drawbacks. It’s definitely not a one-size-fits-all strategy. One of the things to be aware of is that initial compression of a table or index can take quite some time, and will be resource-intensive. It also is an offline operation, so the object can’t be accessed while it’s being compressed or uncompressed. (Clarification: by default, an ALTER TABLE statement is an offline operation. You can declare it online, but, as David notes in the comments, “Although the operation is ONLINE, it’s not completely idiot-proof.”)

So what would happen if your SQL Server service or server restarted while you were in the middle of (or, as it usually goes, 90% of the way through) compressing an index? Let’s investigate.

I have a table that is 1.1 GB in size, with a 1.0 GB nonclustered index.

Table name Index name total_pages PagesMB
bigTransactionHistory pk_bigTransactionHistory 143667 1122
bigTransactionHistory IX_ProductId_TransactionDate 131836 1029

I need to reduce the size of the nonclustered index, so I decide to compress it. After using sp_estimate_data_compression_savings, I determine that I will benefit more from page compression.

I apply page compression, and allow my server to work.

Now, let’s say there is an emergency during this operation. Perhaps a component in the server breaks; maybe the data center loses power. (Or I restart the SQL Server service.) Uh-oh! What happened?

When the service is running, I check SSMS. I see the following error.

What happened to the data? Is compression all or nothing, or is it possible that some pages are compressed and others aren’t? I first check the index size.

Table name Index name total_pages PagesMB
bigTransactionHistory pk_bigTransactionHistory 143667 1122
bigTransactionHistory IX_ProductId_TransactionDate 131836 1029

Nothing has changed – the index is not one byte smaller. This tells me the operation was not successful.

I also check the error log to see what information it can provide.

error log rollback

There are 2 transactions that are rolled back. As I am the only person in this instance right now (the benefits of a test environment), I know those were my transactions.

SQL Server has treated my data compression operation as a transaction. If there is a restart at any point, the operation will be rolled back to maintain data integrity.

Previous Post
Are Table Variables as Good as Temporary Tables in SQL 2014?
Next Post
SQL Server 2014 Licensing Changes

10 Comments. Leave new

  • David Wentzel
    April 3, 2014 8:56 am

    I believe your object’s data is ONLINE if your ALTER is properly declared ONLINE, follows the rules for what can be rebuilt online, and you are running enterprise.

    We’ve run into problems during online index rebuilds where the data was online but other areas of the system were effectively “blocked”. We had an ONLINE index rebuild run for 40 hours before the process was killed by a DBA because the tran log was so large that it effectively shutdown the log reader agent due to blocking. Same with CDC and likely anything else that is tranlog-intensive. In one case we ran out of disk space because the tran log grew and consumed the entire LUN. Although the operation is ONLINE, it’s not completely idiot-proof.

    Anecdotally, with really HUGE tables it seems like the best strategy is to REORG the index before applying compression if the index you are compressing shows even a modicum of fragmentation. Since a REORG is “fully online” (and is killable without rolling back all of its work), the more pages you can release from the structure before you compress, the better.

    Great blog post…keep up the good work…thank you.

    Reply
    • Jes Schultz Borland
      April 4, 2014 7:29 am

      David, you are correct – by default, the operation is OFFLINE, but I could make it ONLINE. Also, “Although the operation is ONLINE, it’s not completely idiot-proof” may be my new favorite sentence! Thanks for reading!

      Reply
  • According to http://technet.microsoft.com/en-us/library/cc280449.aspx (and my experience), data compression can be performed as an online operation.

    Reply
    • Jes Schultz Borland
      April 4, 2014 7:29 am

      Yep, thanks for clarifying. It should say, “By default, it’s an offline operation.”

      Reply
  • Peter Wretmo
    April 3, 2014 11:53 am

    PAGE compression is a superset of ROW compression, hence it is expected that PAGE compression will give more compression than ROW compression.

    I would rather follow this white paper on how to decided whether to use row or page compression.

    http://msdn.microsoft.com/en-us/library/dd894051.aspx

    Reply
    • Jes Schultz Borland
      April 4, 2014 7:30 am

      Great link! I’ve seen cases where row compression saves 40% and page compression only saves another 10%; I’ve seen other cases where row gives 10% and page gives 40% over that. It’s very dependent on data types and what that data looks like.

      Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.