SQL Server Data Compression: It’s a Party!


When I was in high school, Dad and I lived with his mom, my Grandma Ozar, for a couple of years.  We took care of things around the house and made sure her coffee pot was always full.  She could really down that coffee – at least two pots a day.  (Looking back, if we could have reduced her caffeine consumption, she probably wouldn’t have needed so much Valium.)

I Can Quit Anytime I Want*

Grandma and some friends took a road trip to Las Vegas, and while they were gone, I threw an epic party.  We’re talking 30 gallon garbage cans filled with homemade Jungle Juice.  (I’m one of those reasons she couldn’t completely eliminate the Valium.) A couple hundred of my closest friends had a good old time.

A couple of my less-than-closest friends had a little too good of a time at my expense and started trashing the house.  They walked up the staircase smashing the picture frames of every family photo, then started to throw a couch off the second floor balcony.  My security guys (I’m telling you, it was that good of a party) carried them out before they got too carried away.

Cleaning Up After The Party

The next morning, the very-closest-friends did a fantastic job of getting things back to normal.  We replaced all the picture frame glass, got the dirt out of the sofa, vacuumed the place top to bottom, and finished the Jungle Juice.  I thought we’d hidden all our tracks, but we got busted by the tiniest of clues.

Someone had left beer bottle caps on top of door jambs all around the house.

Amazing!  Who thinks of this stuff?  You know they did it on purpose, too – they were just itching to get me into trouble.  One beer cap on one door jamb, I could understand, but all over the house?  Damn.

I was disinherited for that particular shindig.

I learned a valuable lesson: if you’re not absolutely sure you can clean up every trace of everybody else’s messes, you shouldn’t throw parties.  Sooner or later, somebody you can’t trust is going to show up at your party, and they’re going to do something that’ll get you in trouble.  Come to think of it, it’s just like being a DBA.

SQL Server 2008 Data Compression: No Inheritance Either

Microsoft SQL Server 2008’s Data Compression feature lets you compress objects – not just tables, but individual indexes.  Compression does incur a little extra CPU overhead, but that overhead is usually more than offset by reduced IO.  Generally speaking, the database server is sitting around waiting on the disk subsystem, so trading a little CPU work for dramatically less IO results in faster query return times.  You still need to test compression to see if it works well in your environment, because it may not work well in heavy-insert databases.
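
To give you a feel for what that looks like, here’s a rough sketch of enabling page compression on a made-up dbo.Sales table and one of its indexes – swap in your own object names:

-- Sketch: enable page compression on a hypothetical table and one of its
-- nonclustered indexes. Object names here are made up for illustration.
ALTER TABLE dbo.Sales
    REBUILD WITH (DATA_COMPRESSION = PAGE);

ALTER INDEX IX_Sales_OrderDate ON dbo.Sales
    REBUILD WITH (DATA_COMPRESSION = PAGE);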

Today, though, I’m going to focus on the dark side of compression: a complete lack of inheritance.

When you compress tables and indexes, it’s a one-time action.  You’re only taking care of what exists today.  If someone (or even you) turns around and creates an index on that same table tomorrow, it won’t be compressed by default.  Whoever creates the index has to make sure that it’s compressed, and there’s nothing in SQL Server Management Studio that will hint to them that other parts of that same object are compressed.
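
If you’re the one creating the index, you can spell out compression right in the CREATE INDEX statement – another quick sketch with made-up names:

-- Sketch: new indexes don't inherit compression, so declare it explicitly.
-- Table, column, and index names are hypothetical.
CREATE NONCLUSTERED INDEX IX_Sales_CustomerID
    ON dbo.Sales (CustomerID)
    WITH (DATA_COMPRESSION = PAGE);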

To make matters worse, your development, QA, and production environments might all have different compression settings, and you’d never notice it at a glance.  Compression is transparent to applications, so your developers won’t know why one environment performs so differently from another even though they have the same hardware, the same indexes, and the same statistics.

Implementing compression is a multi-step process:

  • Figure out what objects you should compress
  • Plan to handle all of your environments (dev, QA, production)
  • Compress them during a low-activity window
  • Regularly patrol your environments checking for added objects that weren’t compressed (see the query sketch after this list)
  • Keep your environments in sync
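
For the patrol step, there’s nothing built into SQL Server, but a query against the catalog views will show you which indexes and partitions aren’t compressed – a rough sketch you’d want to adapt to your own standards:

-- Sketch: list every user index/partition that isn't compressed, so
-- uncompressed newcomers stand out. Adjust the WHERE clause to taste.
SELECT  s.name AS schema_name,
        o.name AS table_name,
        i.name AS index_name,
        p.partition_number,
        p.data_compression_desc
FROM    sys.partitions AS p
JOIN    sys.indexes AS i ON p.object_id = i.object_id AND p.index_id = i.index_id
JOIN    sys.objects AS o ON i.object_id = o.object_id
JOIN    sys.schemas AS s ON o.schema_id = s.schema_id
WHERE   o.is_ms_shipped = 0
  AND   p.data_compression_desc = 'NONE'
ORDER BY s.name, o.name, i.name, p.partition_number;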

If you don’t stay on top of all of these, you’ll need Valium too.

* – Yes, about that picture.  No, it’s not mine.  Yes, it’s licensed with Creative Commons.  No, it wasn’t even the worst jungle juice picture I could find licensed with Creative Commons.  Yes, I too am amazed that people upload their party pictures to Flickr, let alone license them with Creative Commons.


58 Comments

  • Would it be feasible to schedule semi-regular decompression and recompression times to “clean up” these stragglers?

    Is there a basic DB size range where you would recommend that a DBA look at compression?

    • About the decompression/recompression times – no, actually, you just need to run queries regularly to see what objects aren’t compressed, and then decide whether or not to compress ’em. Unfortunately, there are no built-in tools to do that – the DBA has to roll their own.

      I would recommend that anybody using SQL Server 2008 Enterprise look at compression, regardless of size. If you care enough to spend extra for Enterprise Edition, you should check out that feature, because it can be a free performance boost.
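
      If you want a quick sanity check before compressing anything, SQL Server also ships a stored procedure that estimates the savings – here’s a rough sketch against a hypothetical dbo.Sales table that you’d swap for your own:

      -- Sketch: estimate how much space PAGE compression would save on a
      -- hypothetical table, before actually rebuilding anything.
      EXEC sp_estimate_data_compression_savings
           @schema_name      = 'dbo',
           @object_name      = 'Sales',
           @index_id         = NULL,   -- all indexes
           @partition_number = NULL,   -- all partitions
           @data_compression = 'PAGE';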

  • What a great way to start the blog post! Well done, my friend, well done. Who said data compression had to be a dry subject? Hahahaha…

  • Too bad you have to buy the Enterprise edition to get compression. You know, I had a few parties like that many years ago. I remember it’s hard to hide holes in the wall, too.

  • Great points! I didn’t think about inheritance. I wrote on this subject with several code examples on SSC last year, and even gave a presentation in Vermont last spring.

    I posted in early 2009, after a great project the previous spring of 2008 using SQL 2005 Enterprise Edition’s VarDecimal conversion – highly effective if you are still on SQL 2005 and your databases are heavily dependent on financial decimal data types.

    If you are using 2005/2008 Developer Edition, you have all the features of Enterprise, so you can compress with ease – you just cannot use Standard Edition for this function.

    More code examples for SQL Server 2008.

    • Hugo – yep, and you also have to make sure you don’t have Standard Edition in your disaster recovery environment. That makes for an ugly surprise when you try to recover from a production outage only to find you don’t have the right edition available, and you can’t restore your databases. Whoops!

  • At what cost is compression?

    Being in this field for a number of years, there is one thing I learned: “You get nuttin’ for free, there is always a cost.” I am ready to migrate one of our risk systems from 2005 to 2008, and compression looks like gold (gold, Jerry, gold – Banya), especially for some of the huge, flattened-out fact tables in the system; I’m talking a 7K row. I did not realize that you can use BOTH row and page compression on a table.

    • Paul – howdy, sir. The cost is usually heaviest on heavily written tables. If something’s being constantly inserted/updated/deleted, the CPU cost might outweigh the IO benefits. The best scenario is large tables that rarely get updated, like data warehouse fact tables.
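
      If you’re not sure how write-heavy a table is, something like this rough sketch against the index usage DMV gives a feel for reads versus writes (the counters reset when the instance restarts, so treat the numbers as a rough guide):

      -- Sketch: compare reads vs. writes per index in the current database,
      -- to spot write-heavy tables where compression's CPU cost may hurt.
      SELECT  OBJECT_NAME(us.object_id) AS table_name,
              i.name AS index_name,
              us.user_seeks + us.user_scans + us.user_lookups AS reads,
              us.user_updates AS writes
      FROM    sys.dm_db_index_usage_stats AS us
      JOIN    sys.indexes AS i ON us.object_id = i.object_id AND us.index_id = i.index_id
      WHERE   us.database_id = DB_ID()
      ORDER BY writes DESC;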

  • Rudy Panigas
    April 19, 2012 2:34 pm

    Nice article, thanks for the information!
    Do you know where I can find more information on compression and decompression for SQL 2005? Our vendor is recommending compression and I’ve not used this feature before.

    Here are some of the questions I need to understand first.
    Does compression work when you are doing a select from one compressed table into another compressed table, or from a production server to a development server?
    Can you revert to an uncompressed database?

    Thanks again for the great website! Always lots to read and discuss here 🙂

    • Rudy – compression is totally invisible to queries. It only affects how the database engine writes stuff to disk. You can indeed revert by taking away the compression, but keep in mind that databases with compressed objects can only be attached to SQL Server instances running Developer or Enterprise Edition.
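
      If you do decide to revert, it’s just another rebuild – a quick sketch with a made-up table name:

      -- Sketch: remove compression from a hypothetical table.
      ALTER TABLE dbo.Sales
          REBUILD WITH (DATA_COMPRESSION = NONE);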

  • Rudy Panigas
    April 20, 2012 6:59 am

    Thank you, Brent! One last question: is there any reason why you wouldn’t use compression on a database?

    • Yes, there are several. If my disaster recovery server is Standard Edition, if I’m dealing with very frequently updated data, if I’m dealing with data that doesn’t compress well, etc. You want to test it out in each case and monitor it closely to make sure it’s solving a problem – like any other feature, don’t implement it unless it’s the best solution to a problem you’re trying to solve.

  • Rudy Panigas
    April 20, 2012 7:13 am

    Once again, Brent (Bacon Man) Ozar saves the day!

    Thank you for all your great information and help!

  • Gilbert Gamache
    October 15, 2013 1:45 pm

    In our production environment, our databases are not compressed, but we are looking to compress them in our (multiple, virtual) development environments. Has anyone ever tried that configuration, and would they recommend it?
    I see problems with Database Projects and also with performance comparisons (although that’s not an issue in development).

  • Thanks for the entertaining article, Brent! A bit different from reading BOL. One question: after applying compression to a table, are newly inserted rows compressed like the existing data, or will compression need to be reapplied at some stage in the future to pick up these new rows?

    • Matt – thanks! You’ll want to dig into BOL for details – it’s in the When Compression Occurs section here: http://msdn.microsoft.com/en-us/library/cc280464.aspx

      • Brent,
        thanks for the informative articles. One question came up the other day: I keep reading that compression simply compresses the data in the index and table, but I don’t see anything anywhere saying it also rebuilds/reorganizes the index in the same process. Is it misinformation to think that compression will also incur a rebuild of your indexes in the process?

        • Francisco – if you’re enabling compression in order to perform an index rebuild, I’d humbly suggest that you just rebuild an index if that’s what you need.

          • Brent,
            I did find a line of text in one of the MS articles outlining that applying compression will also rebuild the indexes on the table. However, the reason the current DBA did this was because there are thousands of tables in this vendor DB. Reading your other articles on performance makes me think there are several tables he may wish to pull out of compression for speed alone. By the way, this database was over 2 TB before the compression.

  • How are columnstore indexes affected in SQL 2012? I see that they added extra compression features for 2014+ versions, but I’m not seeing any mention of 2012. Does the columnstore just use its own compression? Thanks!

    • Finally, I see the light, lol… Since there is no inheritance, it’s not like the compression would “trickle down” to a nonclustered columnstore anyway. On the other hand, a clustered columnstore in 2014+ would get compressed with the table, perhaps? I’m running up the learning curve on Enterprise features… 😉

  • Hi Brent.

    If I may… If you have a big table or a big index with only integers in it, would it make sense to apply the page compression option to it? Or would row compression be more sensible?

    Thanks.

    Best,
    Darek

  • Donald R. Cavin
    February 19, 2016 10:18 am

    Ok, got a weird one, Brent. @@version (10.0.5520.0, yeah I know, old as the hills, not the most recent SP, let alone CU, mea culpa)…

    …but…

    …have you ever seen page compression (clustered PK as the only index) cause duplicate key violations on insert?

    …I started page compressing to speed things up with little RAM… all of a sudden, random compressed tables started having dup PK errors… I removed compression… the errors went away… so far in my searches, I have found doodly squat… I’m still working on it, though… it makes no sense…

    • Donald – no, sorry. I’d probably start with troubleshooting the basics – making sure it’s not actual duplicate primary key errors, or race conditions in the code. Beyond what I can troubleshoot in a blog post comment though.

      • Donald R. Cavin
        November 9, 2016 8:10 pm

        Brent, follow-up… it was bad code. Page splits using SELECT DISTINCT (NOLOCK) with a GUID as the PK from a very active OLTP system… we got duplicates from the source. The developer had to recode, and I got to reapply data_compression=page and have FABULOUS performance again. Thanks again.

  • Hi Brent-

    Why does sp_BlitzIndex categorize compressed indexes as “Abnormal Psychology”? Is it simply alerting to their presence, or has it somehow determined that those indexes shouldn’t be compressed?

  • Brent, do you see any problems migrating from 2008 to 2012 with tables that are compressed?

    Thanks

  • Flávio Germano
    August 9, 2016 6:13 pm

    Is it possible to compress a partitioned table?

  • Jeremy Bennett
    August 16, 2016 2:36 pm

    When you run the compression wizard or an ALTER command on a table’s only partition (not the indexes), does the object compress before the statement completes? If the statement comes back with “command completed,” is it done? I tried it and ran some T-SQL to check the space the rows take up, and nothing changed. Thanks in advance.

  • Hi Brent. Thank you for the brilliant article as always. My question regarding index compression is: how do you monitor/measure how the index gets decompressed in memory? I know that the whole compressed page/row is read into memory and only the data being queried gets decompressed, but I am not sure what to look for to show this behavior.
    Please…?

  • Hi Brent,

    I know it is an old post, but I’m wondering what your thoughts are on SAN storage compression vs. SQL Server data compression.

  • Yes, here is the story. We have a SQL 2012 database where the app team applied mostly page-level compression. They ran their performance test and noticed a CPU hike, so they requested more CPU. Our infrastructure team came back saying that we are already doing compression at the SAN level, so there is no need to do compression at the SQL level.
    My question is: does SAN-level compression really compress the mdf file? I know it does for file servers.
    Thanks Brent

    • Vanrani – let’s take a step back. What problem was the app team trying to solve by enabling page level compression?

      If their primary goal was to reduce space, I’d start by using sp_BlitzIndex to identify which indexes are consuming the most space, make sure they’re actually being used, dedupe nearly redundant indexes, etc.

      If their primary goal was to improve performance, and then their performance tests noted that their performance went down, I would tell them to roll the compression back, go back to their original settings, and measure their primary wait type.

  • Thanks Brent, we are going to do exactly what you said. Roll back the compression and run the test again. I will try sp_BlitzIndex. Thanks!

  • Marvel Mayfield
    October 29, 2018 1:25 pm

    Reviving an older post to ask a new question. I’ve seen very little posted on which type of compression is the right type of compression. The only clear statements I’ve seen have been by Thomas LaRock, who says PAGE level is best for tables with lots of writes while ROW is better for tables with lots of scans. I’d be interested in hearing your thoughts on this approach.

  • Brent – thanks for the post!
    The discussion is even more relevant today, since Page / Row compression is available in Standard Edition starting with SQL 2016 SP1.
    With regard to the questions about SAN compression vs. SQL compression: both will reduce disk I/O when reading from disk, but SQL compression will also increase the amount of data that remains in the server’s RAM buffer pool, since the table/index data remains compressed in memory.
    If the DBA team can collaborate with the SAN team, I’d suggest they disable SAN compression and let the DBAs manage it in a more granular fashion. They should also test performance for compressed SQL objects on compressed SAN volumes vs. compressed SQL objects on uncompressed SAN volumes. It may be worth the complexity for the SAN team to provide compressed and uncompressed volumes to the DBA team, and the DBAs could create file groups on either (compressed SQL objects on uncompressed SAN volumes, and vice-versa).

  • Hi Brent.

    Does this mean that incorrect datatypes are no longer an issue? Choosing bigint over int (when int is the appropriate one) is not going to end up occupying more space if we have data_compression, which would compress it according to its value?

  • Hi Brent,

    I am going to implement page-level data compression on all database tables. Could you please confirm: is it a one-time activity, or do I need to schedule it on a weekly basis?
    After applying data compression, will it also apply to upcoming or new data?

  • If I was at the party, then I know who tried to nick your house. Perhaps if I said vasalu, you may know them.

  • Generally, data compression is only used to save memory. Or did you need some more beer?

