It was a dark and stormy… Oh, wrong story. It was actually a warm, sunny afternoon in Charlotte, NC. I was presenting “Index Methods You’re Not Using” at PASS Summit. In this talk, I discussed how indexed views, filtered indexes, and compressed indexes can improve your query performance by reducing I/O.
From stage right, an intrepid audience member raised his hand and asked, “Can you think of an example of when you would use filtered indexes instead of partitioning?”
This question left me speechless (temporarily).
My first thought was one of requirements and practicality: “What if you have Standard Edition? You can’t use table partitioning, but you could create an indexed view” I said.
The better question would have been, “What are you trying to accomplish?”
On the surface, filtered indexes and partitioning may appear similar: both are designed to hold only portions of a table’s data in separate structures. However, how they go about doing this, the use cases for each, and the requirements for each are very different.
Filtered indexes store only a portion of the data in a table. For example, in an orders table, instead of storing all orders in a nonclustered index on the “quantity” field, we could create a filtered index to only store orders with a quantity greater than or equal to 20, or between 10 and 19. In this case, the only column stored in the index leaf pages is the quantity (and the clustered index key).
Sample of data from the table
What a nonclustered index on Quantity would store
What a nonclustered index on quantity, filtered to include quantity between 10 and 19
Let’s say the data in the underlying table changes – someone wants to increase their order quantity from 750 to 1,250. The row is updated at the data level – the clustered index. At the same time, the nonclustered index must also be updated.
Table partitioning breaks the whole of the data into separate physical structure based on a partitioning key. It’s generally used to improve query performance or make administration of large tables easier. Partitioning is based on a schema and a function. The table contains a partitioning key – the field in the table that contains the value you’ll split your partitions on. Most commonly, this is a date field of one sort or another – tables are partitioned by day, month, or year.
To directly compare this to a filtered index: could you use an order quantity as a partitioning key? Yes, as long as it’s a valid data type. The difference is that the partition is going to store all columns from the table – not just the key. When you insert or update a row, SQL Server would have to look at that value to determine which partition to place it on. It would then perform the update on the appropriate partition.
A table, partitioned on quantity – 1-10, 11-20, 21-30, etc
Want to know more about partitioning? Start here.
Comparing How They’re Used
When it comes to reading the data, if you have a filtered index for a specific value, and the query optimizer can use that index, you can often reduce I/O by orders of magnitude because you are storing less of the data in the index itself. With partitioning, SQL Server has to determine which partition the data is stored on, then access it. This can be helpful on very, very large tables – but the care and time taken to implement it and the upkeep required mean it must be very carefully considered and maintained.
Which Should I Choose?
There are many features in SQL Server that may appear similar, but once you look at how they work and what they do, they are very different. Database mirroring is not the same as replication. Change Data Capture is not the same as Change Tracking. Filtered indexes and partitioning solve two very different problems. When considering implementing any new feature, stop to ask, “What is the problem I am trying to solve?”