Does Bit Column Order Matter in Tables?

Development
3 Comments

At the PGConf.dev, where Postgres developers get together and strategize the work they wanna do for the next version, I attended a session where Matthias van de Meent talked about changing the way Postgres stores columns. As of right now (Postgres 17), columns are aligned in 8-bit intervals, so if you create a table with alternating columns:

  1. MyBitColumn1 – 1 bit used
  2. (7 bits wasted for alignment to get to the next byte)
  3. SomeOtherColumn – any other datatype, but not a bit
  4. MyBitColumn2 – 1 bit used
  5. (another 7 bits wasted for alignment)

Matthias pointed out that was inefficient, and that Postgres should separate physical column order from logical column order. Under the hood, it should just store:

  1. MyBitColumn1 – 1 bit used
  2. MyBitColumn2 – 1 bit used
  3. (6 bits wasted for alignment)
  4. SomeOtherColumn – any other datatype

Microsoft SQL Server already does this with bits, as the documentation explains:

The SQL Server Database Engine optimizes storage of bit columns. If there are 8 or fewer bit columns in a table, the columns are stored as 1 byte. If there are from 9 up to 16 bit columns, the columns are stored as 2 bytes, and so on.

To demonstrate that, I whipped up a demo script showing two table creations: one with the bit columns scattered around through the table, and one where they’re all grouped together:

sp_BlitzIndex shows that both tables have the same size:

If we add another bit column to both tables:

Size still remains the same:

Because SQL Server’s just making a metadata-only change, noting that 1 of the 8 bits in the bit-designated space is now available for use by the new Bit6 column. To really drive that point home, let’s go back and update the new Bit6 column to be 1:

And then check the space used again:

Yep, still 194MB. Good work, Microsoft.

The more I learn about Postgres, the more I appreciate so many little things that Microsoft has done over the years for performance & space optimization. The one that’ll really surprise you is that Postgres still doesn’t have table or index compression yet, although it does offer value-level compression.

Previous Post
Updated First Responder Kit and Consultant Toolkit for July 2024
Next Post
Save $350 on the PASS Summit if You Register Now.

3 Comments. Leave new

  • I’ve had a few arguments with oracle and pg DBAs about use of boolean properties (they were all opposed) – I wonder if this is what their opposition is based on. I think they are great for filtered indexes and for statuses that can only ever be in two states that devs won’t be tempted to extend to multiple statuses later and have a very confusing property name

    Reply
  • Mark Freeman
    July 2, 2024 8:23 pm

    I don’t know if newer versions are better, but Oracle at least as recently back in the 10g days needed the physical ordering of bit columns to be done manually, which was sad given their pricing when Microsoft handled it automatically, even back then.

    Reply
  • […] Brent Ozar performs an experiment: […]

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.