Good Reasons to Rebuild or Reorganize Indexes

No, not that

It’s not the F word.

Not like this embarrassing page that was somehow updated in 2017, which still tells you to reorg at 5% and rebuild at 30% fragmentation, like your data is still on a drive that looks like a record player and might still start making a clicking noise when it fails.

It also makes no mention of if this matters to data in memory.

You do have more than 3 GB of RAM in your 64bit server, right?

Fragmentation exists when indexes have pages in which the logical ordering, based on the key value, does not match the physical ordering inside the data file.

Demos

When’s the last time you saw a good demo of index fragmention causing a problem?

About the smartest person I’ve ever met, Joe Sack, once did a demo where he set Fill Factor to 5% — yes, that means the pages were 95% empty — and then scanned the entire index.

That’s empty space, not pages out of order, like the Microsoft docs page talks about.

Empty space. Caused by Fill Factor.

95% of it.

Let’s do the split

Some people will tell you that if have a lot of page splits, you should reduce fill factor.

This sounds reasonable. Sort of. With some empty space on the page, you’ll have a little more room available to insert values, which will reduce page splits.

The problem is that fill factor only gets applied when you rebuild an index, and you need to rebuild or reorg to apply that fill factor, and that lower fill factor is… Fragmentation!

If you chase that snake until the sun rises: in order to prevent logical fragmentation caused by page splits you need to introduce physical fragmentation with fill factor which is only respected when you rebuild or reorganize an index, which you do in order to remove logical fragmentation.

So why do these commands exist?

Well, I’d wager some of it is backwards compatibility. I still see questions about SQL Server 2000/2005 pop up on Stack Exchange.

There are some good reasons to Rebuild an index, like updating statistics.

Er…

Okay, maybe not that. You can always update statistics on their own.

But if you need to change something about an index (that hopefully isn’t Fill Factor), like:

  • Add some Partitioning magic
  • Add compression
  • Change an index definition
  • Change a Filegroup
  • Try to fix corruption in a nonclustered index

An easy finger to point

Index fragmentation is an L1 support tech’s Mr. Boogedy.

I’ve heard it blamed for every conceivable database problem, and so the witch must be burned every night.

The thing is, most of those witches are just nice cat ladies who keep odd hours.

Thanks for reading!

,
Previous Post
Book Review: Database Reliability Engineering by Campbell & Majors
Next Post
So You Want a Cheap Disaster Plan

53 Comments. Leave new

  • Emanuele Meazzo
    November 28, 2017 8:36 am

    I like to rebuild indexes just to remind to the SAN that these pages are frequently accessed, so I want them on a fast tier because I’m important, and to screw with the san admin

    Reply
  • It all depends.. We had a 60GB database that the previous admin only updated stats; after preforming index maintenance that database has 15GB free space.

    Reply
  • Any comment on what good general thresholds would be reorg/rebuild in a few hypothetical scenarios? I’m fairly new to DBA work, and work in consulting at a contract job heavy consulting firm, so a lot of the times in SQL environments I set up, I only get to see them while I am building them and only have vague descriptors of how they will be used. My starting point is usually re-organize at 10 and rebuild at 30 with fill factor 90 and adjust each of those depending on the data I get about usage, but a lot of the times I am guessing. I generally always have disk with good performance, enough RAM but somewhat commonly the client will not have licensing for adequate CPU. When I have SSDs and 10 Gbps iscsi to work with – I have the thought that first, fragmentation isn’t going to matter as much, but on the same note, the performance of storage is so high, that more aggressive index maintenance isn’t going to matter as much either.

    Reply
    • Unless you can prove that the index maintenance is helping anything, you shouldn’t be doing it. Just update stats.

      Reply
    • Fill factor should always be 100 (or 0, the same thing) unless you have performance statistics that prove otherwise.

      Reply
  • So, at the end of the day all you’re saying is… “It depends… on the storage…”

    Reply
    • No, I’d say having data in memory is most important. If it’s not, storage and pathing becomes important.

      Reply
      • But hasn’t that always been the case? Logical I/O has always been faster than physical I/O? Even with a beast of a SSD SAN, logical is still faster.

        Reply
        • Sure, but it’s a lot easier to have a lot of data in memory these days than it was when guidance around fragmentation began getting published. There were no SSDs, no flash, no SANs, and yet the same advice about freaking out over 5% fragmentation persists.

          Reply
          • Yeah, 5% is low. But if it’s just doing a reorg in a small to moderate sized index… and the time for maintenance is there… meh, so what 🙂

          • You can choose to care or not care about things as you wish. It’s your server 😀

  • I’ll be honest, I’m going to have to do some testing on this one. I still have it ingrained that heavily fragmented indexes lend to increased I/O, thus reduces performance. I have clients with a mixture of storage (DAS, SCSI SAN, SSD SAN, etc.).

    Reply
  • Kris Gruttemeyert
    November 28, 2017 2:15 pm

    I’ve yet to see one real-life case (Joe Sack’s example is far from anything ‘real-world’ –but what’s real anyways?) where rebuilding an index (by itself, no stats update) has fixed a major performance issue. I have, however, seen a stats update, by itself, make a server go from bogged down to lean, mean and slippery just by updating stats that influenced a regressed query.

    Also worth noting that index rebuilds don’t update columns statistics, they only update their corresponding index statistics. Updating stats does both (and goes parallel without FULLSCAN in 2016+). So, there’s that little nugget of delight too.

    Rebuild stats, save your IO, profit.

    Reply
    • Straight A student!

      Reply
    • I’ve seen real life case. Execution plan changes for specific problematic query from merge join <- sort <- index scan(index fragmentation 52 %, statistics freshly updated) to hash join <- index seek (after index was rebuilt with fillfactor 0)

      Reply
      • Ooh that would be a nifty thing to demo. Do you have a repro of that?

        Reply
        • I think that optimizer doesn’t know a thing about fragmentation, but amount of data. The culprit is in a number of accesed pages and borderline filter. In the fresly rebuilt index with fillfactor 0 number of pages is minimal. The demo will be very similar to Joe’s( Joe Sack), but less obvious.

          Reply
          • However it doesn’t explain why optimiser choosed index scan with more pages and seek with less in my particular query. So i would go with bad statistics sample. Next time I will definitely do UPDATE STATISTICS WITH FULLSCAN before attempting to rebuild an index to eliminate bad sampling.

  • Hi. Does the same logic apply to data warehouses which are 20-40TB in size? All the data files are hosted on HDD, not SSD.

    Reply
    • I can’t even imagine how long it would take to check indexes at that size for fragmentation, and then defragment them. I’d rather be checking that thing for corruption 🙂

      Reply
  • Don’t forget columnstore! It’s currently possible to create an infinitely fragmented columnstore index. It won’t even get cleaned up by the tuple mover in SQL Server 2016.

    Reply
  • This 2011 post examine whether reindexing was needed with sdd (sql 2008) https://www.sqlskills.com/blogs/jonathan/does-index-fragmentation-matter-with-ssds/
    The tests indicated that there is an impact on both space and IO Eric’s comment takes a slightly different angle.

    There are more considerations than just fragmentation percent to rebuild or reorg:
    •Do queries even use this index?
    •Do I really want to reorg and LOB compact a 50+ GB index (single threaded, )?
    •Am I on Standard Edition where rebuilds are OFFLINE and can cause blocking
    etc.

    Reply
    • S Jones – make sure you read Jonathan’s entire post very carefully, especially the wrap-up paragraph that says:

      The actual runtimes of the two tests were nearly identical, and often flipped back and forth between which one took a few milliseconds longer than the other to complete for this demo.

      Reply
  • When I see a claim that is counter to common wisdom that isn’t backed up by any strong data , I am skeptical. I’ve found that queries run faster when the size of the data is smaller. Yes, memory is cheap, but don’t queries against memory still take time? If you do a hash match between 2 tables, doesn’t the cost go down if both tables have less pages? I guess my premise is this:
    1) With the same operation, queries with less pages take less time
    2) Defragmenting indexes usually make the number of pages in a table smaller.

    As for the proper fill factor, that can be discovered, and if it is less than what an unmaintained index, then it will take less space. And if you never reindex, then the number only gets larger over time. That costs time for everything. Is the time greater than the time to reindex? Maybe no at first, but if we have larger fragmentation, versus a database that has index maintenance done nightly, then the difference can be stark.

    So, I’m not sure what you mean by “cause problems”, but more data means more time to read data, regardless of disk, memory or CPU cache.

    Reply
    • I look forward to your blog posts where you test your hypotheses out 🙂

      While you’re here — do you mind pointing me to any strong data in favor of rebuilding that you’ve seen? Anything that supports common wisdom on the matter?

      Thanks!

      Reply
      • I read Brent’s article linked here and I want to thank the both of you for the thought provoking piece. I am on the fence.
        For:
        1) Extent fragmentation and file system fragmentation are less relevant with VRAID and SSD
        2) Page splits are expensive and a default of 100% fill factor causes more splits

        Against
        1) More pages in memory with index maintenance (assuming limited resources)
        2) Everything is larger with unchecked page-level fragmentation

        I’ll think about this some more, but I am still skeptical.

        Thanks!
        Eric

        Reply
    • Eric – can you give a quick rundown on your method to do this:

      As for the proper fill factor, that can be discovered

      Reply
      • Well, if you have a problem table and the default fill factor causes a lot of page splits, you gradually increment. This is expensive, as it requires personal attention, but some problems warrant that kind of attention.

        Reply
    • So the common wisdom claims that “the practice of reindexing prevents some performance problems”.
      It’s provable but not refutable. To refute that would be the same as proving “No problems are prevented by reindexing”. That’s where the challenge comes from. It’s important for team-reindex to demonstrate the problems that are avoided.

      The best I’ve been able to come up with is a demo where an index scan doubles the number of page reads using a cold cache. For every demo I’ve seen, either the performance impact has been around 1 millisecond of difference or it’s not a demo that represents something we’d see in production.

      I’ve not yet seen a production issue that was tied to fragmentation or page-density that was too low. Maybe I’ve been lucky, but I don’t think so.

      Reply
      • I would have thought the common wisdom is something like “Over time, CL and NCLs get bigger with DML, so here is a mechanism to undo that known issue with these algorithms and data structures.” So the benefit is, macro level, less space used, which means more memory for data cache and less time for scans. See arbitrary example in another reply. As I said in another post, I see that the benefit wanes with the extent fragmentation being less relevant, but is the case for no index maintenance as strong as the case for? I am still figuring that out, since reading, but is this a sacred cow that needs slaughtering? Dunno yet. And does this span RDBMSs? These algos and data structures aren’t unique to SQL, after all.

        Reply
        • Eric — It’s been a pretty common strain of thought in the Oracle world for a long time.

          If you search back through the Ask Tom archives, there’s more hits for it with demos, etc.

          A couple questions:

          Over how much time? A day? A week? A month? If you rebuild an index and the next night it’s 30% fragmented, and then a week later it’s 30% fragmented, and then a month later it’s 30% fragmented, is it ever NOT 30% fragmented?

          If you lower fill factor, how long does it take to get to 30% fragmented? If you lower fill factor to 70%, your index is already 30% fragmented when you rebuild it. You’re leaving 30% free space on the page.

          Here’s the thing: if you can prove fragmentation is THE BIGGEST PROBLEM a server is facing, and that the time and resources you expend fixing fragmentation saves you a commensurate amount of time and resources in query performance, go ahead and fix it. Fixing it by default isn’t solving a problem.

          I’d like folks to be more mindful of these “common wisdom” activities. If you’re using Log Shipping, Mirroring, or AGs, you can fall way behind on your RPO sending all that fully logged activity across the wire.

          Thanks!

          Reply
          • I’m not looking for problems, I’m looking for beneficial properties. While I have never proposed that index maintenance is the main source of problems, I am suggesting that less space and more data in memory are beneficial properties that I like. It makes people happy when I exchange off-hours IO operations to use less memory and CPU. I have also never run into a database for which page splits were the primary problem. Granted, that’s anecdotal, but I also have a feature in SQL Server that allows me to get these properties. It’s been around since I’ve been using it.

            As for the cost – index maintenance generates a lot of logging that can muck with the features you outline. Perhaps not doing maintenance is beneficial there. And I am with you on that.

            At any rate, I appreciate the exchange. It gave me more to think about. I am not totally sold on stopping, but you fleshed out your ideas a bit more and I was able to understand where you’re coming from.

            Cheers,
            Eric

  • Erik – my 2 premises are built in to the algorithms SQL Server uses. And it is simple math. If an IO operation in memory takes x nanoseconds, then twice as many operations take 2x nanoseconds. So, if a hash join has time complexity of O(n) on average, that holds true, but if we have a bad case of O(n^2), then we have x^2 nanoseconds. (http://prestodb.rocks/internals/the-fundamentals-join-algorithms/)

    Now, if we have an access time of 2 nanoseconds and an input of 10,000 pages to this algorithm, the time taken is 20,000 nanoseconds. Let’s say fragmentation expands to 15,000 pages. That’s 30,000 nanoseconds.

    If you have a well-tuned database, and your queries against your hottest tables take 1.5x longer, is that trivial? If your backups take 50% longer? CheckDB? Do I need to do an example that shows more pages = more time?

    I can tell you from the stats on a client I set-up index maintenance on who never did it before that the window for maintenance is pretty small compared to checkdb. In addition, more pages fit in memory. All I had to do is grab Erland Sommerskog’s script, run and schedule.

    Reply
    • The worst case for hash join (O(n^2)) occurs when all rows for the build side hash to the same value. What does the number of accesses to a hash table have to do with fill factor of the underlying table?

      Reply
      • Number of pages is n. Fill factor affects number of pages.

        Reply
        • A hash table exists in memory and possibly tempdb. You’re saying that if the underlying table has a fill factor of 1% then the hash table will be 100X bigger than necessary? That’s an interesting claim. I look forward to seeing a demo of that!

          Reply
          • Nope. I’m saying that the source the hash table reads from is going to be larger, therefor take more time. Since we are building hash tables from the rows, not the pages, the effect goes away when the hash buckets are built. That being said, there is a cost associated with reading tables with more pages versus less pages. Try it for yourself. I did with FF100 and FF30. The difference is more pronounced with merge joins, since building hash tables is so expensive in and of itself.

            Eric — no offense, but this isn’t a good place for long demo code. If you wanna blog about it and link to your blog, I’m totally fine with that. I may even link to it in the article.
            – Other Erik

    • You lost me here.

      In your other comment, you want to figure out the correct fill factor (presumably not 100% since you’re worried about page splits), which introduces the ~empty space~ you’re worried about making things take longer when you rebuild/reorg those indexes.

      Do you happen to have a link to Erland’s script? I’m not sure which one you’re talking about.

      Reply
      • It’s a matter of choice, I suppose. Without maintenance, over time, fragmentation can get really bad. Like a 1:10+ ratio bad. So that is cost number one. Lots and lots of pages. Fragmenting too much gets you lots of page splits. So, is there a middle ground? Can you recover 40% of the pages (real case) without destroying the server. I’d say yes. And you get a bunch of memory back to boot. So, is there equilibrium? That’s my question to the both of you. Or should we throw out the baby with the bathwater, so to speak and stop doing index maintenance. Again, I appreciate that you guys are challenging up as a community, but I’m not one to throw away something that has worked for me because it is less relevant. I mean, I love my autoshrink on. Can’t get rid of that!!!1!!!1

        Eric

        Reply
  • I’m tempted to mention recovering deleted space on heaps as an important edge case where rebuilds can demonstrably fix performance issues, but you did specify that the post is about indexes not heaps.

    Reply
    • Quite hopeful that more people will abandon the index defrag cargo cult that’s pervasive in the SQL Server world (and not so much among people who work with other RDBMs, even MySQL), btw.

      Reply
    • Hi James,

      Yeah, that’s definitely a time when I’d want to do it, or if the HEAP has a lot of forwarded records.

      I’d also want to be really careful, because rebuilding a table that’s a HEAP will also rebuild all of the nonclustered indexes on it — some folks have nonclustered primary keys and no clustered index, plus normal nonclustered indexes.

      Good times.

      Reply
  • This is a great conversation / thread.

    I’ve found that rebuilding helps in the following scenarios

    Low latent, high tps workload (>100k/sec) AND PLE dips frequently (saw tooth pattern)

    Very large objects where a re-build gains back 10-100’s of GB AND PLE dips frequently.

    I’ve seen both of these scenarios in VLDB’s (defined as > 10TB). Updating stats is always stop #1; when that’s not enough, rebuilding an index or three has shown to bring down run times – specifically in the two scenarios mentioned above. In scenario one, the query plan remained the same. In scenario two, the query plan changes because there’s enough memory. *Note, both workloads are on different servers with 2tb of ram / per.

    As always, ymmv, but key rebuilds are necessary. However, I don’t think that’s the point Erik is making the article. I read it to say “don’t blanket a db with a full re-build of everything just because” – have a reason. and ensure that reason can be measured.

    Reply
  • Hi I just tested whether rebuilding a heavily fragmented clustered index on a dev database could affect performance, and it did.
    DB: SQL Server 2008
    Table: 930340 rows with a Clustered Index on a BigInt
    Pre-rebuild
    78.55% page fullness and 81.11 total fragmentation
    I ran this query to cause an index scan:
    SELECT * FROM dbo.TestTable
    WHERE non_indexed_col1 >’k’ OR non_indexed_col2 like ‘US%’

    Pre-rebuild the query had an average duration of 56 seconds over 3 runs. Post-build the average was 43 seconds
    Logical reads was 83540 before rebuild, and dropped to 67143.

    Seems fairly conclusive – rebuilding a fragmented index can improve performance.
    See this link: https://www.brentozar.com/archive/2012/08/sql-server-index-fragmentation/

    Reply
    • Erik Darling
      March 16, 2018 1:13 pm

      Right, it couldn’t possibly have been that rebuilding the index read it into memory so it was ready for your query, and before that you had to read stuff from disk.

      🙂

      Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.

Menu
{"cart_token":"","hash":"","cart_data":""}