When you write a query, you (usually) don’t want to return all the rows from the table(s) involved – so you add a WHERE clause to the statement. This ensures that fewer rows are returned to the client – but doesn’t reduce the amount of I/O done to get the results.
When you create a nonclustered index, you can add a WHERE clause to reduce the number of rows that are stored at the leaf level – a filtered index. By having fewer rows in an index, less I/O is done when that index is used.
Filtered indexes are great performance boosts if you have a value that is used in a predicate very frequently, but that value is only a small amount of the total values for that table. (Say that ten times, fast.) For example: I have an orders table that contains a Status column. Valid statuses are Open, Processing, Packing, Shipping, Invoicing, Disputed, and Closed. When the business first starts using this table, there’s a good chance there is a fairly even distribution of orders across these statuses. However, over time, a majority of orders should be in Closed status – but the business wants to query for Open, or Disputed, which are only a small percentage.
This is where a filtered index can come in.
When you add a WHERE clause to the index creation statement, you’re limiting which rows are stored in the index. The index is smaller in size and more targeted. It can be trial and error to get the query optimizer to use the filtered index, but when you do, gains can be significant. In one case study I have, logical reads dropped from 88 to 4 – a 95% improvement! What are some of the things you can – and can’t – do with them?
What You Can do in a Filtered Index…
- Use equality or inequality operators, such as =, >=, <, and more in the WHERE clause.
- Use IN to create an index for a range of values. (This can support a query that does an “OR” – read about “OR” and “IN” with filtered indexes here.)
- Create multiple filtered indexes on one column. In my order status example, I could have an index WHERE Status = ‘Open’, and I could have another index WHERE Status = ‘Shipping’.
- Create a filtered index for all NOT NULL values – or all NULL values.
What You Can’t do in a Filtered Index…
- Create filtered indexes in SQL Server 2005.
- Use certain expressions, such as BETWEEN, NOT IN, or a CASE statement.
- Use date functions such as DATEADD for a rolling date range – the value in WHERE clause must be exact.
- The query optimizer won’t consider filtered indexes if you’re using local variables or parameterized SQL for the predicate that matches the filter. Jeremiah explains how dynamic SQL can help.
To learn more about indexes, check out our index category, or my Fundamentals of Index Tuning course.
“The query optimizer won’t consider filtered indexes if you’re using local variables or parameterized SQL” That’s not true in all cases. You just have to make it statically known that the filtered index can apply. Like this:
For a filtered index WHERE SomeCol > 1234 you can query it using:
DECLARE @var INT = …;
WHERE SomeCol = @var AND SomeCol > 1234
Realize this is an old list, but haven’t seen anyone discuss this aspect. I work with multi layer queries all the time. Have had some success by pushing parameterized filter variable into a subquery. For example if @PatientID identifies a patient I need to filter by and I need to use a filtered index on a large audit table that only indexes activity in recent years, Applying the paramater or a variable as in ‘WHERE tablename.patId = @PatientID’ doesn’t use the filtered index (as explained here) and resulted in an expensive table scan. But ‘WHERE tablename.patid in (Select patid from PatientMasterTable as p where p.patid = @PatientID)’ does seem to work. The result of what appears to be a silly query to get a value that is already known ends up converting the variable to discrete data. The two layer query finds the filtered index. Also works with table variable for multiple values.
Most likely your original query (before the subquery) gets trivial optimization rather than full optimization. We cover that in our Fundamentals of Query Tuning class and Mastering Index Tuning class.
Ahhh filtered indexes, how I love thee. I’m constantly amazed that so few people (I come across) know about them. Great writeup Jes.
Glad to see filtered indexes getting some much needed attention. One thing worth pointing out–and perhaps this is old hat to many readers–is that INSERTS and UPDATES to a table with a filtered index will *fail* (Msg 1934, Level 16, State 1, Line 1) unless they are made from a connection with the following options properly set:
set quoted_identifier on
set ansi_nulls on
those should be the default for most connections, but a word to the wise if you’re considering adding a filtered index to an application (i.e. 3rd party) where you don’t control the entire codebase.
i gotta think there’s some extended events magic one could setup to monitor one’s application for such bad behavior before deploying filtered indexes (i.e. throw an event when an inserts/updates occurs to a table of interest with either option OFF), but this has proved beyond my present knowledge of EE (nil), interest level, free time, and/or googling prowess.
if anybody out there has such a thing (or would be willing to check it out), that’d be righteous.
Mike, just want to reemphasize your point…I ran into this issue yesterday. We had a…errr…problematic load query running that was running slower than we wanted, and in an effort to help it along, we tried a filtered index that -should- have made it much, much faster.
But of course, we immediately see the progress count stall, and go and look and sure enough the query failed (it included an UPDATE)…as soon as we added the filtered index. Scanning up through the query, there it was… “SET QUOTED_IDENTIFIER OFF”.
Luckily we were able to drop the index and restart without too much pain, but it was one of those somewhat obscure, irritating errors. I was at first wondering why SQL Server wouldn’t just ignore the index and not use it if incompatible with that setting, but I guess the update still had to hit that index anyway. So, Caveat Indexor, or however the expression goes.
I ran into the same issue: I created a simple filtered index “Status 2”. After that all inserts failed with “.INSERT failed because the following SET options have incorrect settings: ‘ANSI_NULLS, QUOTED_IDENTIFIER,CONCAT_NULL_YIELDS_NUL( ²Û”…..and all Updates failed with a similar error.
Can someone explain why this happened?
(…that is “Status not equal to 2″….but the site deleted the not-equal operator)
It’s a limitation from Microsoft – the same thing applies to indexed views.
One tricky case is if you create a Stored procedure with the wrong settings, then it doesn’t matter if you current connection has the correct settings the insert/update will be executed using the settings of “quoted_identifier and ansi_nulls” at the time the stored procedure was created.
Just to try to help out on this subject…
I’m facing a performance issue after creating a filtered index on a table that is 25 billion rows in size (and not yet partitioned – work in progress…). The index in question is on a ‘datemodif’ field that is YYYYMMDD format, varchar(8).
Thanks in advance Peter, I don’t know if your comment will help me find the root cause but I stumbled on this bit of info that can help identify any objects that have either of the above settings OFF when they were created.
SELECT * FROM sys.sql_modules
WHERE uses_quoted_identifier = 0 or uses_ansi_nulls = 0;
This showed me all procedures & triggers that had either setting OFF.
Just in case it can help someone in future.
Just wanted to point out that OR is not allowed in the grammar. So this clause:
WHERE Status = ‘Open’ OR Status = ‘Shipping’
could only be used if it were changed to:
WHERE Status IN (‘Open’,‘Shipping’)
Aaron, thanks for the clarification!
Create index doesn’t accept like clause such as
col like ‘ABC%’
but this can easily be re-written as
col >= ‘ABC’ and col < 'ABD'
Looks like you can use local variables if you have Option(recompile) on the query. It will take advantage of the filtered index.
SQL Server 2012
Iam trying to find a way to compare 2 columns i the same table in the filtered index, ie
setting the Where ValidFrom < ValidTo. Is it only premitted to have hardcoded values in the comparision?
Thanks for great site.
Why not give it a try and find out? I promise, SQL Server doesn’t bite.
“I thought you said SQL Server doesn’t bite!”
“That is not mySQL.”
Any suggestions for needing to filter on a CLR data type not being NULL? In my case, I want to ensure that within a given parent, you cannot have more than one child with the same name. (George, George, George, George….) but, only in cases where the hierarchy has been populated. The hierarchy field has to be Nullable because during initial creation I need to get the identity value of the record and its parent(s) to create the string for the hierarchy. Cousins with the same name are okay. Thanks!
Bryan — for q&a, head over to http://dba.stackexchange.com
Fair enough, let me rephrase.
An additional limitation for “What You Can’t do in a Filtered Index…” is filtering on a CLR datatype, such as hierarchyid. This extends to computed columns defined by a CLR data type, even when persisted, such as [HieararchyColumn].GetLevel().
The returned error in this case is “Msg 10619, Level 16, State 1, Line N
Filtered index (Index Name) cannot be created on table (Table Name) because the column (CLR data typed column) in the filter expression is of a CLR data type. Rewrite the filter expression so that it does not include this column.”
If you’ve got a way around it, let me know!
Bryan – for Q&A, go ahead and hit a Q&A site like https://dba.stackexchange.com. (We really don’t use CLR data types here.) Thanks for understanding!
Rebuild is required for Filtered Non-Clustered Indexes? .I’ve table where index created for a column with filter index as “Not null”.
Now the table records is close to 4 million out of which only 50k contains value rest all has null values.
Rebuiding this index take more than 2 hours.
Filtered nonclustered indexes are just like any other nonclustered index. When considering rebuilding them, make sure you’re actually going to solve a problem: https://groupby.org/conference-session-abstracts/why-defragmenting-your-indexes-isnt-helping/
where column1 is not null or column2 is not null
Arghh OR is forbidden
I had this issue. Solution : create two separate filtered indexes, one on each expression. Both indexes will be used and combined. Implementing this solution meant we didn’t have to run on business critical tier anymore and we saved loads of money.
[…] What You Can (and Can’t) Do With Filtered Indexes by Brent Ozar […]