Filtered Indexes and IS NOT NULL

Last Updated 7 years ago

Filtered indexes can be tricky. Just getting your queries to use the filtered index can be a real pain. Parameterization may mean it’s “unsafe” to use the filtered index. The optimizer might want the filtered column in your key or include column list when it doesn’t seem like it should have to be there. The weather might be cloudy.

But there’s one type of filtered index that everyone can love for a couple different reasons: IS NOT NULL.

“IS NOT NULL” filter for a selective query…

Here’s an example index. We’re using the StackOverflow sample database and creating the index only on Posts which are closed (a small subset):

CREATE INDEX ix_Posts_Score_ClosedDate_INCLUDES_FILTERED on dbo.Posts 
    (Score, FavoriteCount, ClosedDate) 
    INCLUDE (Id, Title, Tags, CommentCount, OwnerUserId, LastEditDate, LastEditorDisplayName)
    WHERE (ClosedDate IS NOT NULL);
GO

CREATE INDEX ix_Posts_Score_ClosedDate_INCLUDES_FILTERED on dbo.Posts

(Score, FavoriteCount, ClosedDate)

INCLUDE (Id, Title, Tags, CommentCount, OwnerUserId, LastEditDate, LastEditorDisplayName)

WHERE (ClosedDate IS NOT NULL);

Here’s a query looking for Posts with a ClosedDate in a given range. Note that the query does NOT say “ClosedDate is NOT NULL”, it’s just specifying a value range:

CREATE PROCEDURE #ClosedPostsByScoreAndFavorites
    @Start DATETIME,
    @End DATETIME,
    @Score INT,
    @FavoriteCount INT
AS
    SELECT
        p.Id,
        u.DisplayName,
        p.Title,
        p.Tags,
        p.CommentCount,
        p.FavoriteCount,
        p.Score,
        p.ClosedDate,
        p.LastEditDate,
        p.LastEditorDisplayName
    FROM dbo.Posts as p
    JOIN dbo.Users as u on p.OwnerUserId=u.Id
    WHERE 
        p.ClosedDate between @Start and @End
        and p.Score > @Score
        and p.FavoriteCount > @FavoriteCount
GO

CREATE PROCEDURE #ClosedPostsByScoreAndFavorites

@Start DATETIME,

@End DATETIME,

@Score INT,

@FavoriteCount INT

SELECT

p.Id,

u.DisplayName,

p.Title,

p.Tags,

p.CommentCount,

p.FavoriteCount,

p.Score,

p.ClosedDate,

p.LastEditDate,

p.LastEditorDisplayName

FROM dbo.Posts as p

JOIN dbo.Users as u on p.OwnerUserId=u.Id

WHERE

p.ClosedDate between @Start and @End

and p.Score > @Score

and p.FavoriteCount > @FavoriteCount

We give the query a run and look at the plan….

EXEC #ClosedPostsByScoreAndFavorites @Start='2011-01-01', @End='2014-01-01', @Score=2, @FavoriteCount=2;
GO

1 2	EXEC #ClosedPostsByScoreAndFavorites @Start='2011-01-01', @End='2014-01-01', @Score=2, @FavoriteCount=2; GO

Woooo, it matches my index!

The highlighted index is ix_Posts_Score_ClosedDate_INCLUDES_FILTERED

And now for my next trick… what if you need to guarantee uniqueness for a column that allows NULLS?

We’re entering controversial territory here, so I’m going to be careful. SQL Server treats NULL as a value. This is a really big deal to some people, and I totally understand if you want to vent in the comments about how that’s a terrible thing and it shouldn’t be that way, and SQL Server stole your pickup truck. But it’s the SQL Server we live in.

Treating NULLs as a value means that I can’t create a unique index or a unique constraint on a column that allows NULLs if it has more than one row with NULL in it. More than one NULL means I have duplicates.

A filtered unique index can get you around the problem, like this:

CREATE UNIQUE INDEX uq_UsersAllowNulls_DisplayName on dbo.UsersAllowNulls ( DisplayName )
    WHERE DisplayName IS NOT NULL;
GO

CREATE UNIQUE INDEX uq_UsersAllowNulls_DisplayName on dbo.UsersAllowNulls ( DisplayName )

WHERE DisplayName IS NOT NULL;

Yay for indexes!

We’ve got tons of material on indexes, start reading more over here.

Getting Started with Diskspd

22 Comments. Leave new

Laren Hagen
September 16, 2015 11:49 am

Re: NULL in a Unique Index – since ‘NULL’ means, in essence, ‘unknown’ even a single NULL value in a column that has any other value potentially violates a unique constraint. I’d likely be even more conservative than Microsoft here and ensure that no NULL values are permitted on a column where uniqueness must be guaranteed.

Your solution to create a filtered unique index is a great way to allow ambiguity of NULL values while still enforcing the business need of “value must be unique if known”

Reply
raulggonzalez
September 17, 2015 3:39 am

Great post! I’m a big fan of filtered indexes WHERE blabla IS NOT NULL as can save you lots of space in big tables, but I’ve never thought about enforcing uniqueness this way.

Thumbs up!

Reply
mmiike
September 17, 2015 3:40 am

Excellent article, and quite timely =) thank you.

I thought the truck-theft issue was resolved in 2016…or was it closed as “Won’t Fix”? Readers of a more heterodox bent may be reminded of a similar issue, which is when you’re not looking, and PostGres drinks all your beer.

Reply
TuneIn
September 17, 2015 4:21 am

Is there a reason, other than being able to easily run it with other parameter values, for embedding this query in a temporary stored procedure?

I believe this one may suffer from parameter sniffing, when entering bad values for the parameters.
It may not be likely, but what happens if you enter NULL values for @Start and @End?
I don’t have the StackOverflow database at hand.

Does it make sense to enter the Closed Date as last key column in the index?
I think it may be better to put it first, because apparently you typically search (for) a range.

It’s not clear from the SELECT statement, as there is no ORDER BY clause.

Reply
- Kendra Little
  September 17, 2015 8:21 am
  
  Parameterized queries can be particularly tricky to get to use filtered indexes, so the example here is just to show that a parameterized query can use this filtered index.
  
  For the question about “what happens”, go ahead and download the sample database and play with it!
  
  Reply
Joseph Hoetzl
September 17, 2015 10:21 am

So where do you stand on using filtered indexes “IS NULL” on partitioned tables? It is working well as a way to reduce index size, and force it to use only a specific partition, and make maintenance easier, but curious your thoughts.

We have some tables with an Archive Date column. Table is partitioned monthly for the main reason of behaving like OLTP in the primary partition, and OLAP in any archive partitions with different indexes. The column is set to NULL so the partition range will hold it in the primary, when that date is determined later, it is inserted and the row moves to the appropriate range, which is then switched out as needed.

Reply
- Kendra Little
  September 18, 2015 9:06 am
  
  It sounds like the filtered indexes are aligned and aren’t preventing partition switching, so that’s good!
  
  If you’re partitioned by ArchiveDate and that value gets updated, essentially moving the row is a delete-insert combination. That might eventually become a bottleneck, depending on your process for when you update those rows, how many are updated, and what the perf requirements are at that time.
  
  Reply
Matt G
September 17, 2015 8:09 pm

But… but… but… SQL Server *did* steal my pickup truck?

I don’t have a pickup truck now and I want one – so SQL must have stolen it!

🙂

Reply
Tim Cartwright
September 28, 2015 12:54 pm

Quick question:

Would you be able to foreign key to that unique constraint on the display name from another table?

Reply
- raulggonzalez
  September 28, 2015 1:26 pm
  
  I’d say not at all.
  
  True that a unique constraint create a unique index to enforce it, but that does not work the other way around. A unique index does not create a unique constraint.
  
  So to add a FK you need a constraint, not enough with a unique index, even less if that’s filtered.
  
  sounds pretty reasonable for me…
  
  Reply
  - Kendra Little
    September 28, 2015 1:35 pm
    
    A foreign key constraint can reference a unique index. I swear it’s somewhere in books online, but I’ve tested it as well. Testing the filtered thing real fast.
    
    Reply
    - raulggonzalez
      September 28, 2015 1:37 pm
      
      BOL:
      
      A foreign key constraint does not have to be linked only to a primary key constraint in another table; it can also be defined to reference the columns of a UNIQUE constraint in another table.
      
      It says UNIQUE constraint, not index… but testing it now.. 🙂
      
      Reply
      - Kendra Little
        September 28, 2015 1:43 pm
        
        Do test– it works against unique indexes. Used that trick more than once. Responding to the original comment because I tested the filtered scenario.
    - raulggonzalez
      September 28, 2015 1:43 pm
      
      Unique index is OK,
      
      FIltered returns
      
      Msg 1776, Level 16, State 0, Line 24
      There are no primary or candidate keys in the referenced table ‘dbo.Table2’ that match the referencing column list in the foreign key ‘FK_Table1_Table2’.
      Msg 1750, Level 16, State 0, Line 24
      Could not create constraint or index. See previous errors.
      
      …
      
      I didn’t know about the unique index, nice question!
      
      Reply
      - Kendra Little
        September 28, 2015 1:48 pm
        
        Yeah– I didn’t correct you to be pedantic, can be really useful since unique index allows includes 🙂
      - raulggonzalez
        September 28, 2015 1:57 pm
        
        I’d never think you’re correcting anyone to be pedantic.
        
        It’s actually a nice trick and now I won’t forget it for sure.
        
        Thx!
        
        Pd. Again BOL needs to another coat of paint…
- Kendra Little
  September 28, 2015 1:45 pm
  
  I tested and if you try to create a foreign key to a filtered unique index, you get this message:
  
  /*Msg 1776, Level 16, State 0, Line 19
  There are no primary or candidate keys in the referenced table ‘UsersAllowNulls’ that match the referencing column list in the foreign key ‘fk_Users_FK_SillyTest’.
  Msg 1750, Level 16, State 0, Line 19
  Could not create constraint or index. See previous errors.
  */
  
  As discussed in the rest of this thread, you can create foreign keys against (non-filtered) unique indexes. This can be useful in some cases because unique indexes only apply the uniqueness to the key columns, but let you add included columns for query support, whereas unique constraints do not support included columns.
  
  Reply
Snarky McSnarkSnark
October 4, 2015 9:17 pm

I had just discovered the “UNIQUE INDEX where the value is not NULL” trick a few weeks ago, and it’s allowed me to get rid of 5+ GB of indexes used to quickly locate unprocessed rows in a transactional database. It’s a fabulous feature, and it has improved the performance of several scheduled tasks rather dramatically.

However, I have found one issue resulting from it that I don’t fully understand, so was hoping you could help. A few stored procedures started to fail, because they ANSI NULLS setting had (evidently) been different in the SP’s session than they were when the index was created. Is there a workaround for this, or do I have to re-compile all stored procedures with explicit ANSI NULL options?

Reply
- Kendra Little
  October 5, 2015 8:36 am
  
  Everything that modifies the table will need to have the required SET options specified in the “filtered index” section here: https://msdn.microsoft.com/en-us/library/ms188783.aspx
  
  These settings have to have the same values in procedures and other code if you’re ever going to use indexed views or filtered computed columns as well.
  
  Reply
Dave Boltman
September 5, 2018 7:42 am

SQL Server is great, and didn’t steal my truck, until today. I want to create a unique index on *two* fields, and allow nulls. i.e. the unique index will be on FIeld1, Field2, and the index filter expression will be WHERE ((Field1 IS NOT NULL) OR (Field2 IS NOT NULL)). But of course the OR operator is not allowed in the filter expression. Why oh why!! AND is allowed, but not OR.

Wait – I left my truck on the driveway, but it not there now!~!@

Reply
Tony Matthews
October 2, 2018 11:45 am

Came across this and had some fun getting IS NOT NULL filtered indexes to work. I think it turns out to be an issue with ANSI_NULLS in SQL…

For example, take a large table with a sparsely populated column. What you need is a filtered index using IS NOT NULL. This creates a nice small index of the rows where the column is NOT NULL. You run this with a join and SQL just ignores the index!

Turns out the problem is ANSI_NULLS is an execution time setting, so SQL assumes it might have to join using NULL=NULL if someone sets ANSI_NULLS to OFF. The result is that it won’t use the filtered index.

The work around is to mirror the filter condition for the index in your query as well. SQL now starts using the index. I assume this happens with other values in the filter as well – so if you want to use a filtered index, add the filter condition to your queries as well.

Reply
Things I Wish I Knew When I Started » FLX SQL with Andy Levy
January 21, 2019 6:44 am

[…] of these aren’t technical “I wish I understood the nuances of filtered indexes” type of thing. Many of them are hard-learned lessons that, had I known these things, would […]

Reply