Vertical Partitioning Is Almost Never the Answer. Here’s Why.
You’re looking at a wide table with 100-200 columns.
Years ago, it started as a “normal” table with maybe 10-20, but over the years, people kept gradually adding one column after another. Now, this behemoth is causing you problems because:
- The table’s size on disk is huge
- Queries are slow, especially when they do table scans
- People are still asking for more columns, and you feel guilty saying yes
You’ve started to think about vertical partitioning: splitting the table up into one table with the commonly used columns, and another table with the rarely used columns. You figure you’ll only join to the rarely-used table when you need data from it.
However, you’re dreading the amount of code changes required. It’s gonna affect everywhere in your application that does inserts, updates, deletes, and selects. And at the end of the day, are you even sure it’s going to help performance? You don’t have the time to build a test version of the entire application with the new table structures.
Stop. Do index tuning first instead. A nonclustered index is literally a vertically partitioned copy of the table. You might even build an index with the commonly queried columns of the table – and find out that it’s being used for scans. Those scans will be faster than full table scans because the commonly-queried copy is simply physically smaller! The less columns you use for it, the faster those scans will go. Don’t do table changes that require app code changes before you’ve attended my Fundamentals of Index Tuning and Mastering Index Tuning classes.
Next, consider columnstore. SQL Server’s columnstore indexes store every column independently so that you can query just the columns you want. This can have spectacular payoffs for query performance, although not for all workloads or datatypes. Start by taking the free ColumnScore.com quiz to get a rough idea of whether your table is a good fit for columnstore indexes, and if so, check out my Fundamentals of Columnstore class.
Index changes don’t require code changes. Start there, and only after you’ve exhausted those easy wins, should you consider vertical partitioning.
Some of my posts are long and involve detailed code examples. Not this one, though, because the answer’s pretty straightforward. Vertical partitioning isn’t just hard work for you, it’s hard work for everyone around you. Do the easy stuff first, then do the stuff that might seem hard for you, but doesn’t require work from other people. Assigning yourself AND other people hard work should be your very last resort.
Related

Hi! I’m Brent Ozar.
I make Microsoft SQL Server go faster. I love teaching, travel, cars, and laughing. I’m based out of Las Vegas. He/him. I teach SQL Server training classes, or if you haven’t got time for the pain, I’m available for consulting too.
Get Free SQL Stuff
"*" indicates required fields

10 Comments. Leave new
Short, sweet and to the point. Vertical Partitioning = Endless woes.
“A nonclustered index is literally a vertically partitioned copy of the table.”
Gold.
Also the index can have include columns that within reason can be adjusted for any other columns that aren’t part of the key and are needed. (BLOBS and LOBS not withstanding.)
The “within reason” needs to be respected, dammit. – I’m channeling Gumby.
Index changes don’t require code changes… generally, but hey look at our app! *triple backflip facepalm*
Not trying to be condescending but isn’t that one of the first things most DBAs usually check for whenever there are slow running queries? Are there any table scans? Can I optimize the query or add indexes? Even semi regular review of the indexes using the first responder tool kit should regularly uncover tables that have high scans and need some extra index love.
This is genuine question. I haven’t had an issue like this in my db career since I started doing regular index maintenance (outside of the rare columnstore index use case). I don’t know why any DBA would want to fight the fight to rebuild parts of the app and involve devs before doing the basics.
You did say “almost never”. 😉
I find vertically partitioning tables with large columns, such as email body or JSON packets, can be helpful, especially in OLAP environments; you don’t need to move as much data around for INSERTs and UPDATEs.
Bold of you to assume there are only 200 columns.
“Cannot create a row of size 8071 which is greater than the allowable maximum of 8060.”
(we decided we don’t really need 20 digits for those not null columns, 18 is enough and moves the problem a few years ahead)
[…] Vertical Partitioning Is Almost Never the Answer. Here’s Why. (Brent Ozar) […]
[…] Brent Ozar lays out an argument: […]
Personally, I would marvel at a 200 column table that was actually normalized properly. I can see plenty of possiblities for it, but I also expect most of the cases are just a mess and should be split for logical, not physical/performance reasons.
The ACDOCA (Universal Journal) table in SAP has 489 columns. It’s also a columnstore table.
https://community.sap.com/legacyfs/online/storage/blog_attachments/2022/01/Amounts.png
To the right and slightly below the the “index” tab is the column count.
There may be ones with more columns but that’s my first stab and I’m away from my work computer.
There are also “table pools” in earlier versions and I won’t go too much in detail other than to say normal SQL cannot be run against them easily.