DBA Training Plan 10: Managing Index Fragmentation

In the last episode, we looked at your index designs with the output of sp_BlitzIndex. You might have gotten a little overwhelmed what with all the different warnings (and all the learning resources!)

You might have thought, “Is there an easier way?”

“Is there a way I can just run a job, and have something automatically clean my indexes up?”

DefragAnd that’s how folks start focusing on fragmentation rather than the design of the indexes themselves.

Those of us who got our start on Windows back in the days when it ran on spinning rusty magnetic frisbees called “hard drives” used to run apps to defragment the data. These apps would shuffle bytes around, getting them all lined up into contiguous rows. The idea was that if your data was all in a row, then your spinning rusty magnetic frisbees would be able to read the data faster. See, the way they worked internally was that they had a head that physically moved around the drive, reading one place at a time, just like a record player.

What’s that?

What’s a record player?

<sigh>

Music used to be stored on vinyl discs. No, not the compact ones – those were plastic. This is before that. No, even before tapes. These vinyl discs had little grooves in them that would cause a needle to vibrate, and …

NO I’M NOT MAKING THAT UP, IT WAS REALLY A THING.

Lemme just stop there. The whole “defragment your drives” advice is really, really old, that’s what I’m saying. It stems from a day and age where technology was completely different than it is today. Your SQL Server’s data files probably don’t live on spinning rusty magnetic frisbees anymore, and even when they do, they’re scattered amongst lots of other data files – even from different servers all sharing the same drives. All these different data files are being accessed at the same time. The whole reason we used to defragment our data is just dramatically less relevant today.

Today, a lot of your data is stored in RAM. You know what that first R stands for? Of course you don’t, kid, just like you don’t remember that cars used to have record players. <sigh> RAM doesn’t stand for Sequential Access Memory, and you don’t need all your data pages lined up in a row in RAM. A table’s pages aren’t all lined up in a row in RAM, nor does that matter. (Does fragmentation matter if you hit disk? Yes, but then you’re screwed anyway.)

To make matters worse, a lot of folks try to “fix” fragmentation by setting fill factor lower than 100%, purposely leaving empty space on each 8KB data page so that they can add more data later. When they do that, they:

  • Make the database larger (which also makes their backups, restores, corruption checking, index rebuilds, etc all take longer)
  • Make their memory smaller (because each 8KB data page will have a percentage of empty space, and that space is cached in RAM as well)
  • Make table scans take longer (because more logical reads are involved)
  • Give a unicorn heartburn

So what’s a DBA supposed to do?

Leave the server-level fill factor at the default 100%. If you can prove that the default is causing you a problem on a specific index, I’m okay with lowering it on specific indexes, but I’m going to make you prove you made things better instead of worse. If you want to do that, read Jonathan Kehayias’ post on how to track page splits with Extended Events, and make sure to read the comments too.

Run Ola Hallengren’s index maintenance scripts weekly. We talked about these free tools when we were covering backups and restores, and they’ve got more tricks up their sleeve: index reorganization, rebuilds, and statistics updates. I’m not a fan of their defaults, though: start with these less-aggressive defaults instead. Avoid running ’em more than once a week, though, unless you can prove that rebuilding your indexes & updating your stats is fixing things (rather than making ’em worse.)

And if your mentor told you that you have to run daily index rebuilds in order to fix execution plans, I’ve got good news and bad news. The bad news is that your mentor was wrong – but the good news is that I’ll teach you the real problem over the next couple of episodes.

Wanna learn more? Here are a few more resources for this episode’s topic:

  • Why Defragmenting Your Indexes Isn’t Helping – a 1-hour video where I explain the two kinds of fragmentation, show why one kind does matter, and show in more detail why setting fill factor is such a bad idea
  • What’s So Bad About Shrinking Databases? This common maintenance task actually causes fragmentation, which is especially terrible if you’re doing both this and defragmenting your indexes.
  • Columnstore Online Index Rebuilds – Columnstore indexes need more attention to maintenance. Niko Neugebaeur is the go-to for in-depth columnstore tips, and in this post, he analyzes the online rebuild process and its overhead. (You’re definitely gonna want online index rebuilds when you’re dealing with big data warehouses.)

Now that we’ve got the right indexes in place, the next episode will tackle how queries are executed.

Previous Post
DBA Training Plan 9: My 3 Index Guidelines
Next Post
DBA Training Plan 11: The Basics of Executing a Query

14 Comments. Leave new

  • So, I have gone to strictly doing daily statistics updates – generally in the morning, a few hours before the bulk of the work force comes in (we are a 24×7 shop, but the largest portion of the staff is from about 0700-1800) and after the majority of the nightly jobs have completed. I only do index rebuilds on a manual basis now. Even then the majority of the cases where I do an index rebuild is when going from one version of SQL server to something newer, but I think that the effective result from that is just a statistics update. I just take the opportunity to do it when the app is down. I had only seen a big result in testing from it one time, going from a 2016 dev server with tf 4199 off, to 2017, where there was a bug in the CE in 2016 (that was fixed by tf 4199) that caused an overestimation which made a query run better, that got fixed in 2017 whether tf 4199 was on or off.

    From the recommended settings, I think the point is just to rebuild the indexes when fragmentation gets extreme and ensuring it gets done – or is it that there is a case to do regularly scheduled index rebuilds?

    My concern with scheduling them to be automatic is that when it suddenly runs it could create hundreds of Gb of transaction logs we wouldn’t be able to handle without manual intervention and then create more problems than it solves.

    Reply
    • Good news! Statistics updates don’t generate hundreds of gigs of logs. They’re only modifying one 8KB page per statistic.

      Reply
      • Yep – to be clear on your point though, your suggestion was that you only do anything physical with the indexes when fragmentation gets extreme, and typically not something that happens on a regular basis?

        Reply
        • Gotcha – so are you thinking a very fragmented index required more log space to rebuild than a barely fragmented index?

          Reply
          • No, just the total size of the index. If we normally only do statistics on an index of several hundred Gb, which produces very few logs, and then it suddenly rebuilds when it passes 30%, it’ll produce a lot.

            I’m a little bit slow and decaffeinated this morning if I’m not making sense. i just thought that the guidance on index rebuilds now should be an exception event and not a regular scheduled event. Like after a big delete, or modification to a lot of non-LOB variable length fields etc. If I understood your video on youtube from a few years ago correctly, my takeaway was that index re-orgs are essentially useless, that the primary performance benefit from rebuilds came more from the new statistics and less from the physical re-organization of the index, and that necessity of index rebuild frequency is hard to predict. I think you used the stack overflow user account table, where accounts would come and go and eventually turn into swiss cheese, but would be hard to quantify how often it needs a rebuild.

          • Keith – OK, gotcha, may need to re-read this post really carefully. I’ve got the takeaways in there and everything.

  • Holy nostalgia trip Batman! I’d forgotten what the Defrag process looked like in old Windows. I kinda miss it now…

    More relevantly, I’ve been having this very discussion at work with a few folks and they have universally said “That makes so much sense, why haven’t done that already?”. Mind if I borrow your analogies to make the case to the less technical staff as well? 🙂

    Reply
  • Ken Hemmerling
    August 13, 2019 1:58 pm

    I stumbled across a table with a uniqueidentifier primary key named ActivityId and columns StateCode (int) and ActivityTypeCode (int) along with many others. The tables has 13M rows and there are 94 unique values for ActivityTypeCode with row counts ranging from 1 to 4M. OK, here’s the weird part: I have 97 different filtered indexes on (StateCode, ActivityId) where the only difference is the where clause on (ActivityTypeCode=(_some_value_)).

    To my eyes, all 97 indexes could be dropped and replaced with one index on just StateCode (since ActivityID is the primary key and is already included as a hidden column). My question is: Is there any reason why I would want to keep multiple filtered indexes? Could they reduce deadlocks somehow? I’ve done some testing and I don’t see any reason to have so many indexes. Thoughts?

    Ken

    Reply
  • “(Does fragmentation matter if you hit disk? Yes, but then you’re screwed anyway.)”

    Wish that was in big, bad, bold, type… Trying to convince customers that RAM is cheap, solves many problems ,like frag, and ups overall performance of SQL by having that data-stuff right next to those powerful CPUs you splurged on, for one of the lowest spends, is still like pulling teeth. Even with a 12Gb/sec, or better, all-flash SAN, if you hist disk….. (fill in the obvious…) We plan for a 2::1 Actual data-size::RAM ratio – exceed that, and expect degradation, and that includes having to add defrag if you’re still on spinners…

    Care to comment, Brent, on whether it’s advisable to run defrag even if on local SSD or all-flash SAN if it’s very clear that frequent physical reads are taking place, due to exceeding that 2::1 ratio…? (Excluding backup and DBCC CHECKDB activity, to be clear, this is business-oriented, in-day, *frequent*, high-volume reads) ‘twould help have an authoritative voice contributing to a none too infrequently raised issue, especially by SAN-folk not acquainted with SQL. Thanks! And for the reminder-post.

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.

Menu
{"cart_token":"","hash":"","cart_data":""}