Index Tuning Week: How Many Indexes Are Too Many?

This week I’m running a series of posts introducing the fundamentals of index tuning. First up, let’s talk about understanding how many indexes make sense on a table.

The more nonclustered indexes you have on a table, the slower your inserts and deletes will go.

It’s that simple.

If you have a table with 10 nonclustered indexes, that’s 10x (or more) the writes an insert has to do, across all of the data pages involved with each index. (The data page writes happen later, asynchronously, during a checkpoint, but that’s still more work for storage, and there’s still synchronous blocking involved.)
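To make that concrete, here’s a minimal sketch (the table and index names are made up for illustration) showing how every nonclustered index becomes one more set of pages each insert has to modify:

```sql
-- Hypothetical demo table: every nonclustered index below is one more
-- set of pages each INSERT has to modify.
CREATE TABLE dbo.Orders
(
    OrderId     INT IDENTITY PRIMARY KEY CLUSTERED,  -- 1 structure to write
    CustomerId  INT,
    OrderDate   DATE,
    ShipDate    DATE,
    TotalAmount MONEY
);

CREATE INDEX IX_Orders_CustomerId ON dbo.Orders (CustomerId);  -- +1 write
CREATE INDEX IX_Orders_OrderDate  ON dbo.Orders (OrderDate);   -- +1 write
CREATE INDEX IX_Orders_ShipDate   ON dbo.Orders (ShipDate);    -- +1 write

-- Each row inserted here touches the clustered index plus all three
-- nonclustered indexes: four sets of page modifications per row.
INSERT INTO dbo.Orders (CustomerId, OrderDate, ShipDate, TotalAmount)
VALUES (42, '2018-10-08', '2018-10-10', 99.95);
```

Scale that from 3 nonclustered indexes up to 10, and the write amplification on every insert and delete follows right along.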

I’ve seen so many DBAs who scream bloody murder about 15-second IO warnings on their storage, and then when we go look at their tables, they have dozens of indexes on every table. We remove the unneeded indexes, and presto, suddenly their storage workload drops. Their storage does less work, which means it can react to other requests faster, and suddenly they have a cascading performance improvement throughout their queries.

People weren’t complaining about the speed of inserts and deletes – but those background operations were killing storage with a thousand small cuts.

To help performance tuners, I came up with Brent’s 5 and 5 Rule: aim for around 5 indexes per table, with around 5 columns (or fewer) in each.

This is not set in stone. It’s simply based on the fact that I have 5 fingers on one hand, and 5 fingers on my other hand, so it’s really easy to remember.

Brent’s 5 and 5 Rule

5 and 5 is just a starting point for discussion.

Sometimes, 5 indexes aren’t enough. Sometimes your users want to filter or sort by lots of different fields on a really wide table. However, the more indexes you add, the slower your inserts and deletes will go, and the more competition pages will have for precious memory space. You may have to throw hardware at it in the form of more memory and faster storage.

Sometimes, even just 5 indexes are too many. When you have a table where insert and delete speeds are absolutely critical, and select speeds don’t matter, then you can increase performance by cutting down on your indexes.

Sometimes, 5 columns are too many. If you’re indexing big VARCHAR(MAX) columns, or you’re indexing hot columns (columns that constantly change), you can still run into problems. And yes, included columns count against your total – if that surprises you, watch my free How to Think Like the Engine class to learn how pages are built.
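For example (the index and column names here are hypothetical), included columns widen the index just like key columns do, because they get copied onto the leaf pages:

```sql
-- Hypothetical example: this index has 2 key columns, but the 3
-- INCLUDEd columns are copied onto its leaf pages too, so it stores
-- 5 columns' worth of data per row - right at the 5-column guideline.
CREATE INDEX IX_Users_DisplayName
    ON dbo.Users (DisplayName, Location)
    INCLUDE (Reputation, CreationDate, LastAccessDate);
```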

Sometimes, 5 columns aren’t enough. Maybe you’ve got a query that has 3 fields in the filter, the combination of which is extremely selective, and it just needs to return 4 fields in the SELECT. That’d be 7 fields. That’s completely okay – remember, Brent’s 5 and 5 Rule just stems from the fact that I have 5 fingers on my left hand, and 5 fingers on my right hand. If I’d had an accident in shop class, we might be talking about Brent’s 4 and 5 Rule, and laughing about my lack of skill with the table saw.
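A hypothetical version of that 7-field scenario might look like this – 3 selective key columns to satisfy the WHERE clause, plus 4 included columns to cover the SELECT:

```sql
-- Sketch of a 7-field covering index (names are made up):
-- the 3 key columns handle the highly selective filter,
-- and the 4 INCLUDEd columns cover the SELECT list so the
-- query never has to do key lookups back to the clustered index.
CREATE INDEX IX_Orders_Covering
    ON dbo.Orders (CustomerId, OrderDate, StoreId)
    INCLUDE (ShipDate, TotalAmount, Carrier, TrackingNumber);
```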

Wanna learn how to pick which indexes you should build, and what columns should go in ’em? That’s where my Mastering Index Tuning Class comes in.


26 Comments

  • I’m laughing at the image of the developers for our vendor app having 20 fingers on one hand and 30 on the other

  • Tom Wickerath
    October 8, 2018 4:52 pm

    “The more nonclustered indexes you have on a table, the slower your inserts and deletes will go. It’s that simple.”
    Updates will also be negatively affected, as there are two index modifications (a deletion + an insert) for each data update operation.

    • Tom – well, not exactly. Updates are only involved if you’re updating fields contained in the index. (I left that out on purpose – we dive into that more deeply in the class.)

      • Tom Wickerath
        October 9, 2018 3:08 pm

        Yes, I was implying fields contained within the index.
I think you could make the same statement about inserts and deletes… e.g. inserts/deletes are *only* involved if you are inserting/deleting data in a column contained in a non-clustered index. It is certainly possible to create non-clustered indexes on columns that allow NULL. I don’t think an insertion or deletion of a row would result in an index needing to be updated in that case. Am I mistaken?

        • Tom – why not test your idea? It’ll only take a matter of seconds using the writes column in sp_BlitzIndex.

          • Tom Wickerath
            October 9, 2018 4:47 pm

            I just ran an experiment, as you suggested. A bit of a delay, as I had to first download and install sp_BlitzIndex, and create/populate a table with some data. I guess one cannot make the same statement about inserts and deletes. Me thinks this might be because a non-clustered index includes a clustered index’s key as the row locator. 🙂

          • TA-DAH, so awesome. Next watch the How to Think Like the Engine videos. They’re totally free and they’ll open your eyes to a lot of stuff.

        • And to think through it a little deeper when you’re doing the experiment – if you wanted to find all the rows where a column is null, would SQL Server use an index on that column? If so, how would that mechanically work? I love stuff like this – if you haven’t watched How to Think Like the SQL Server Engine, take some time today to watch that. You’ll find a lot of neat stuff in there.

          • Tom Wickerath
            October 9, 2018 3:35 pm

            > If so, how would that mechanically work?
            I wasn’t addressing whether or not SQL Server would *use* an index, as part of compiling a query plan. I was only addressing your statement:
            “The more nonclustered indexes you have on a table, the slower your inserts and deletes will go. It’s that simple.”

          • Totally understood. Now run the experiment. 😉

This seems like a really irresponsible guideline and encourages database administrators and engineers to reduce their involvement with properly designing a system based on gathered requirements. It’s not only irresponsible, it’s just lazy.

    Here’s a new 5×5 rule. Every month, take 5 indexes that you can tune and 5 that you can drop (based on their usage statistics). My left hand is used for optimization. My right for destruction. This is a far more effective use of a DBA’s time and will be far more impactful than just telling someone they can’t deploy more than 5 indexes.

    • Jason – try reading the whole post. In absolutely no way did I say “they can’t deploy more than 5 indexes.”

      • I get that, but you have to admit that most people will interpret this as a hard and fast “Ozar” rule. I understand the spirit of what you’re trying to say, I’m just worried that others will misinterpret it and start spreading awkward “best practice” recommendations – like, “don’t worry about index fragmentation” leading to not bothering with index maintenance at all.

        For instance, I have a table with 14 indexes. Each of which covers a respective query and performs perfectly for the given requirements. According to the index usage statistics, along with the query execution statistics, this design is doing just fine. The table is not a high-insertion entity so we’re not concerned with any sort of DML related performance issues. We take the approach of investigating overall index performance, rank them by impact, and tune accordingly.

        I suppose this is actually what you’re getting at and you’re just trying to get DBAs started with how to dive in. In which case, I’m sorry for being a jerk.

Now we know why Brent is so fond of that Jazz Hands caricature…

  • So you are saying that having a table with 33 indexes, one of which had 998 columns in it is a bad idea?

I’ve seen you mention this loose rule before and it normally gives me some pause. We run a bespoke ORM kind of business layer in our application where we have many objects that represent a row in a table, but they might be retrieved by other columns than the identity, so we have a small number of indexes (one per predicate), but each index will include the whole table, basically.

    The upside for us is our business objects can load quickly and have all values on a row, but the downside is any insert/update obviously is doing many more writes.

In some instances we have different objects which represent a search result spanning multiple tables – these normally only include a subset of the columns and fit the 5×5 rule more closely. But we can’t index for those specifically without removing the fields from the indexes that the main objects use (since the same leading column of the index is used, meaning duplication).

    Our application reads more data than it writes so trading off extra writes vs key lookups on the reads (or significant rewriting so we don’t need all the data immediately on load and can have narrow indexes) seems the right trade off to make.

    So that’s the question – are key lookups potentially as bad (or worse) than more writes?

    • Mark – that’s where all the text below the picture comes in, and you have to read that part. Getting the exact right number of indexes in your environment involves rolling up your sleeves and doing some work. I do wish I could do that work for everyone for free, and we do as much as we can in our open source tools like sp_BlitzIndex, but for personal advice, that’s where our consulting comes in.

      • Fair enough!

        I’m a somewhat long time reader, first time caller and it’s through your guys’ fine work that we’ve got as far as we have. Our application performance is pretty good currently but I always look for things that can be done better.

        That you didn’t ask me if I’d been drinking is encouraging.

  • Toby Ovod-Everett
    November 2, 2018 11:42 am

    Guy Steele (in “Coders at Work”): “If I could change one thing – this is going to sound stupid – but if I could go back in time and change one thing, I might try to interest some early preliterate people in not using their thumbs when they count. It could have been the standard, and it would have made a whole lot of things easier in the modern era. On the other hand, we have learned a lot from the struggle with the incompatibility of base-ten with powers of two.”

In other news, you may want to investigate SawStop table saws. The videos are pretty cool. Now if only I could convince Microsoft to release a version of SQL Server that has a similar feature for Scalar Functions. Any time a developer tried to create a Scalar Function, SQL Server would immediately halt and make them spend $100 before they could restart the service! Then if I could somehow get all of our vendors to use that version for development, I’d be so much happier.

The TL;DR for this article is: it depends. I work on an ERP system that has 10 or 15 indexes on most tables out of the box, and we still experience poor performance on some queries or get hints for missing indexes. With an app like that we can’t remove them due to hints in the code. It’s something we have to live with, and we need to be creative to gain better performance.

Start by staying on top of which existing indexes are not used, or which indexes have a high write-to-read ratio. The use cases and data access patterns for an application can drift over time, making once-useful indexes obsolete. Still, what can be potentially worse than an update with too many indexes is an update without a covering index, which results in frequent and extended periods of blocking.
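One way to spot that kind of drift is SQL Server’s index usage DMV. Here’s a rough sketch (the 10:1 write-to-read threshold is just an arbitrary example, and the stats reset when the server restarts, so make sure it’s been up long enough to be representative):

```sql
-- Sketch: find nonclustered indexes in the current database whose
-- writes dwarf their reads. sys.dm_db_index_usage_stats resets on
-- service restart, so interpret the numbers with uptime in mind.
SELECT  OBJECT_NAME(s.object_id) AS table_name,
        i.name                   AS index_name,
        s.user_seeks + s.user_scans + s.user_lookups AS reads,
        s.user_updates                               AS writes
FROM    sys.dm_db_index_usage_stats AS s
JOIN    sys.indexes AS i
        ON  i.object_id = s.object_id
        AND i.index_id  = s.index_id
WHERE   s.database_id = DB_ID()
  AND   i.type_desc = 'NONCLUSTERED'
  AND   s.user_updates > 10 * (s.user_seeks + s.user_scans + s.user_lookups)
ORDER BY writes DESC;
```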

  • What about a scenario where you have a date table. Something super static that has zero inserts. We insert like 100 years worth of dates in it and there it sits. I see easily 30 indexes on it for different range combinations (last/first day of year, last/first day of month, last/first day of week, etc.) Sure, many of them are rarely used, but is there really any harm in a table with 36K rows in it with no updates having so many indexes? I figure, small table, no updates, no harm.

