Index Rebuilds Make Even Less Sense with ADR & RCSI.


Accelerated Database Recovery (ADR) is a database-level feature that makes transaction rollbacks nearly instantaneous. Here’s how it works.

Without ADR, when you update a row, SQL Server copies the old values into the transaction log and updates the row in place. If you roll that transaction back, SQL Server has to fetch the old values from the transaction log, then apply them to the row in place. The more rows you’ve affected, the longer that rollback takes.

With ADR, SQL Server writes a new version of the row inside the table, leaving the old version in place as well.

Because you’re a smart cookie, you immediately recognize that storing multiple versions of a row inside the same table is going to cause a storage problem: we’re going to be boosting the size of our table, quickly. However, the problem’s even bigger than that, and it starts right from the beginning when we load the data.

ADR Tables Are Larger From the Start.

We’ll demo it by creating two databases, Test and Test_ADR. I have to use different databases since ADR is a database-wide setting. Then, I’ll create two test tables, Products and Products_ADR, and load them both with a million rows.
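To give you the general idea, here’s a minimal sketch of that setup – the column names other than QtyInStock and the filler-data trick are just placeholders for illustration, not necessarily the exact demo code:

```sql
CREATE DATABASE Test;
CREATE DATABASE Test_ADR;
GO
/* ADR is a database-level setting. */
ALTER DATABASE Test_ADR SET ACCELERATED_DATABASE_RECOVERY = ON;
GO

USE Test;
GO
CREATE TABLE dbo.Products
(
    Id          INT IDENTITY(1, 1) NOT NULL PRIMARY KEY CLUSTERED,
    ProductName VARCHAR(200)       NOT NULL,
    QtyInStock  INT                NOT NULL,
    INDEX IX_QtyInStock NONCLUSTERED (QtyInStock)
);
GO
/* Load a million rows of filler data. */
INSERT dbo.Products (ProductName, QtyInStock)
SELECT TOP (1000000)
       CONCAT('Food #', ROW_NUMBER() OVER (ORDER BY (SELECT NULL))),
       1
FROM sys.all_columns AS ac1
CROSS JOIN sys.all_columns AS ac2;
GO
/* Repeat the CREATE TABLE and INSERT in Test_ADR, naming the table Products_ADR. */
```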

The end result looks like this:

[Screenshot: random food rows in the Products table]

Let’s compare the sizes of the two tables using sp_BlitzIndex. The first result set is Products (the normal table), and the second result set is Products_ADR.
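If you want to run the same check yourself, the table-level sp_BlitzIndex calls look something like this (sp_BlitzIndex is part of the free First Responder Kit):

```sql
EXEC sp_BlitzIndex @DatabaseName = 'Test',     @SchemaName = 'dbo', @TableName = 'Products';
EXEC sp_BlitzIndex @DatabaseName = 'Test_ADR', @SchemaName = 'dbo', @TableName = 'Products_ADR';
```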

[Screenshot: initial index sizes from sp_BlitzIndex]

The clustered and nonclustered indexes on the Products_ADR table are all larger because, like Read Committed Snapshot Isolation (RCSI), ADR needs to add a timestamp to each row to track its version. That timestamp must take up extra space, and that’s the reason, right?

Well, not exactly – rebuild the indexes on both tables and watch what happens:
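Nothing fancy here – just the plain rebuild-everything form:

```sql
USE Test;
GO
ALTER INDEX ALL ON dbo.Products REBUILD;
GO
USE Test_ADR;
GO
ALTER INDEX ALL ON dbo.Products_ADR REBUILD;
GO
```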

The results:

[Screenshot: index sizes after the rebuild]

The ADR table’s indexes plummet in size to match the non-ADR table. That’s… odd to me. If I had an unlimited amount of time, I’d be curious to find out why that is, but I’m just a man standing in front of a database, asking it to love me, and I’m on the clock, so I gotta move on to the real problem.

I can understand why folks historically said, “After you turn on RCSI, you should rebuild your indexes to reduce space lost to page splits.” That isn’t what happened here, though – ADR was already on at the time we loaded the data, so we shouldn’t have had page splits to add versioning data. The versioning timestamps should have gone in with the initial inserts. Really odd.

I’ve also repeated this demo with additional databases with both ADR & RCSI, and just RCSI, complete with rebuilding the indexes after the load so everyone’s on the same starting point. To keep this blog post short, I’ve omitted the demo code, but you can download the full demo code here.
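The only new ingredient in those extra databases is the RCSI switch. A sketch, with placeholder database names following the same pattern (run it with no other connections, or lean on the rollback clause shown):

```sql
/* RCSI, like ADR, is a database-level setting. */
ALTER DATABASE Test_RCSI     SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;
ALTER DATABASE Test_ADR_RCSI SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;
ALTER DATABASE Test_ADR_RCSI SET ACCELERATED_DATABASE_RECOVERY = ON;
```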

[Screenshot: index sizes by feature combination]

The results all have similar sizes:

  1. No features enabled
  2. ADR on
  3. ADR and RCSI on
  4. RCSI on

However, things start to change when the data does.

ADR & RCSI Tables Grow Really Quickly.

Let’s update 10% of the rows in all of our tables:
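Per table, that’s a single update – the modulo filter below is just one easy way to hit roughly 10% of the rows, not necessarily how the demo code picks them:

```sql
/* Bump QtyInStock from 1 to 2 on roughly 10% of the rows. */
UPDATE dbo.Products
   SET QtyInStock = 2
 WHERE Id % 10 = 0;
```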

And then check their sizes:
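The screenshots come from sp_BlitzIndex again, but if you want a quick sanity check without it, the space DMVs tell the same story (run this in each demo database):

```sql
/* Reserved size and row count per index in the current database. */
SELECT o.name AS table_name,
       i.name AS index_name,
       SUM(ps.reserved_page_count) * 8 / 1024.0 AS reserved_mb,
       SUM(ps.row_count) AS row_count
FROM sys.dm_db_partition_stats AS ps
JOIN sys.objects AS o ON o.object_id = ps.object_id
JOIN sys.indexes AS i ON i.object_id = ps.object_id
                     AND i.index_id  = ps.index_id
WHERE o.is_ms_shipped = 0
GROUP BY o.name, i.name
ORDER BY reserved_mb DESC;
```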

[Screenshot: index sizes after updating 10% of the rows]

In the “normal” database, the object sizes remain pretty similar because SQL Server was able to update the rows in place. The nonclustered index on QtyInStock grows because about 10% of our rows are changing from 1 to 2, so we’re going to have to move them to their new sort order, which will require new pages.

On the other hand, in the databases with ADR & RCSI enabled, the object sizes exploded, nearly doubling.

If we update another 10% of the rows:

And check the sizes again:

[Screenshot: index sizes after updating another 10% of the rows]

All of the databases see growth on the QtyInStock indexes as values move from 1 to 2 – a different place in the b-tree – which causes some new page allocations. But the clustered indexes stay the same size: in the ADR & RCSI databases, the earlier explosion of page splits left a lot of empty space behind, and we’re able to reuse that space for this new round of updates.

Let’s try several rounds of updates – this is gonna take a while:
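Again, a sketch rather than the exact demo code – the idea is just to keep touching rows over and over:

```sql
/* Ten rounds of updates, each hitting a different ~10% slice of rows. */
DECLARE @round INT = 0;
WHILE @round < 10
BEGIN
    UPDATE dbo.Products
       SET QtyInStock = QtyInStock + 1
     WHERE Id % 10 = @round;

    SET @round += 1;
END;
```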

And then check our sizes:

[Screenshot: index sizes after many rounds of updates]

In the ADR & RCSI databases, the clustered index has stabilized at around twice the size of the “normal” database. Is that a problem? Well… in most folks’ eyes, yes. They see this larger table as wasted space, and they want to fix it.

You Can “Fix” That With Index Rebuilds. Don’t.

To fix table bloat caused by old row versions sticking around, just rebuild the indexes:

And the objects drop back to their initial small sizes, same as right after our data loads:

[Screenshot: index sizes after the rebuild]

People get all excited, saying they’ve “saved” disk space – but guess what’s gonna happen as our user workloads start up again, updating just 10% of the rows:

The databases with ADR and/or RCSI are right back to double the clustered index size:

[Screenshot: index sizes after continued update activity]

Summary: What’s the Problem That You’re Trying to Solve?

If you’re still rebuilding indexes like it’s 2005, thinking that you’re accomplishing something, stop. Time has moved on, and best practices have moved on.

You have to zoom out and rethink what the real problem is. You’re not really saving disk space, because it’s going to blow right back up again in no time. You’re still going to need that disk space because users gonna use.

I’m not saying you should never do index rebuilds – there are absolutely cases where you should. A good example is when rows go in with a lot of null defaults, but then over the next several minutes or hours, those nulls get filled in, and the rows are never touched again. In that case, your older pages don’t need the extra empty space that resulted from all those early-on page splits, and an index rebuild will help cram ’em back in tightly.

In most cases, though, if you’ve turned on ADR or RCSI and you think you’re saving space, your index rebuild space gains are illusory and temporary.

Update 2025-08-19: Fill Factor Helps Here Too

There was a discussion on LinkedIn about this post, and Vedran Kesegic wrote that setting fill factor to 90% would help here. Leaving 10% free on every page was enough to let SQL Server write ADR’s version information to the same page. Only the smallest index jumped in size (by 40%) because SQL Server was packing in so many rows per page that even 10% empty space wasn’t enough to keep the version info on-page.
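In T-SQL terms, that’s just the FILLFACTOR option on the rebuild – something along these lines:

```sql
/* Rebuild leaving 10% free space on every leaf page. */
ALTER INDEX ALL ON dbo.Products_ADR
    REBUILD WITH (FILLFACTOR = 90);
```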

Vedran noted that as SQL Server adds new pages to existing objects (to handle inserts or rows moving around due to key updates), the fill factor setting is ignored, and data’s still going to get pumped in with 100% fill factor. In that case, Vedran suggested index maintenance jobs with a rebuild threshold of 50% might make sense – again, using a 90% fill factor so that the newly distributed data would leave enough empty space to avoid the problem.

I’d add: remember that a 90% fill factor means you’re leaving 10% empty space on every page, which means your database just grew by 10%, your memory was cut by 10%, your maintenance jobs (backups, checkdb, index rebuilds, stats updates, etc) will take 10% longer, table scans will take 10% longer, etc. I’m not dead-set against lowering fill factor – just make sure you’re solving a problem with it rather than creating new problems.


6 Comments

  • This should be mandatory reading for SQL Server folks that blindly follow _any_ conventional wisdom. I’ve been saying something similar to this since around SQL 2000. I came from an Oracle background where the storage engine is radically different than SQL Server’s (for starters…there is no concept of “clustered indexes” b/c they are not _needed_). In the ora world there is no conventional wisdom to regularly rebuild indexes b/c they realized that fundamentally an index rebuild is treating the _symptoms_ (poor page utilization, bloat, etc) instead of _the cause_ (fundamentally, you got a bad physical design). The notion of requiring regular rebuilds should indicate to you that you are simply causing pages to split again.

    I love how you worded it… “what is the problem you are trying to solve?” Back in 2000, DBA folks seemed to me overly obsessed with disk space utilization, failing to see that they were needlessly causing page splits later, and likely causing slowness for the app users at the worst possible times.

    There are tons of examples of _SQL Server conventional wisdom_ that took decades to dispel…blindly upping the MAXDOP settings on every server, autoclose/autoshrink, various conflicting guidances on correct tempdb settings, I can think of at least a dozen just on tran log myths…

    Even the case you mention where “null defaults are populated later” (hence requiring a rebuild) is still fundamentally a design problem.

    It felt like it took decades to convince folks that _regularly scheduled rebuilds_ were something that really required a bit more scrutiny.

    _A firm understanding of 10% of SQL Server is way better than misunderstanding 100% of it_

    • The objectives in the year 2000 would have been different. That was long before my time, but I worked with clients’ old Exchange and SQL Servers from even the mid-to-late 2000s, and judging from the old invoices, there were boxes approaching $100k with less than 60 GB of usable disk space. There were also performance-killing features in common use, like single instance storage, whose sole purpose was to economize the (at the time) extremely expensive disk capacity. I distinctly (and painfully) remember fighting to free up 500 MB on those crap boxes, and it was a meaningful victory.

      I’ll agree with you on bad design, but in the SQL Server world, even with Enterprise Edition, bad design and then throwing resources at it is cheaper than better design and development. I hate it, but on a platform where 8 more cores only cost another 60 grand instead of half a million, you are going to end up with web developers, full stack developers, and ‘self-taught’ report writers doing design and development.

      • I agree with a lot of this. Certainly I worked on machines where we ran out of capacity on a LUN and you had to free up disk space. That to me is different than merely saying, “Hey, I’m gonna reindex weekly because it removes a lot of ‘air’ in my data… because, well, that seems smart to me.”

        And, yup, throwing hardware at a problem is often much cheaper than dealing with refactoring.

        What I’m talking about, and I think Brent as well, is that there are many DBAs that follow outdated advice as though it was dogma without truly thinking through the problem. I’ve seen so many DBAs over the years that install their set of standard “maintenance” jobs on every server they ever touch, and one of those jobs is index rebuilds. 95% of the time it probably doesn’t hurt anything or even improves things marginally…but this leads to so many conversations like “my reindex job runs weekly and I still have a lot of air in the tables, I guess I’ll run the reindex job nightly then.” And that just makes no sense to me.

  • RCSI has been around for more than a decade, but I’m only just learning this now? I would not have expected the rebuilds to make the tables suddenly forget the row versioning overhead. Online rebuilds certainly wouldn’t.

  • […] Brent Ozar lays out an argument: […]

  • Whoa, and for the bonus round, why not throw in some Change Tracking and enable all three to really stir the pot? Like some crazy game show, where each update triggers 7 extra writes with the possibility of a page split for the daily double to get us to 9.

