When you index a nullable field, are the rows with nulls stored in the index? It’s easy enough to find out by creating a table with a nullable field, and then creating an index on it:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
CREATE TABLE dbo.Employees ( ID INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED, EmployeeName VARCHAR(50) NULL ); GO INSERT INTO dbo.Employees(EmployeeName) VALUES (NULL); GO CREATE INDEX IX_EmployeeName ON dbo.Employees (EmployeeName); GO /* Then get the execution plan for this: */ SELECT * FROM dbo.Employees WHERE EmployeeName IS NULL; GO |
Yep, SQL Server stores the nulls in the index. That makes sense, really, because sooner or later, you’re going to want to find the rows with nulls, and when you’ve only got a few nulls in a big table, the index can help a lot there.
To learn more about this stuff in action, watch my free 90-minute course, How to Think Like the SQL Server Engine.
13 Comments. Leave new
SQL Server can also use and index for COUNT operations rather than the table, so it is critical that both have the same number of rows.
Your execution plans include the number of rows and time. Where is the setting to turn that on?
Aimee – that’s in the new SQL Server Management Studio 18. It’s only in preview phase, but it’s publicly available, and preview 7 has been pretty good quality for me.
If you’re still on SSMS 17, it does includes counts (but not times) in the Live Query Statistics. This may be a decent substitute.
Interestingly enough, Oracle made the opposite choice and does not store NULLs in the indexes (requiring some ugly tricks with indexing the results of a function if you’d like to find them).
What about SPARSE columns?
I haven’t ever used those, sorry! Not a popular feature there.
What if over 90% of a nullable field is eventually filled with NULLs and the slow statement searches using WHERE field IS NULL? (amongst other fields in the non-clustered index)
Jonathan – I’ll rephrase your question as, “What if I’m looking for 90% of a table?”
Yes, yes that would be slow, yes.
Yes, good point. What I was wondering is whether to leave out those values completely from the index, even if they are used in a JOIN or WHERE clause. Or should the field be moved to the INCLUDE part of the index?
Jonathan – that kind of design question is a little beyond what I can do in a blog post comment, but it’s a great sign that you’re ready for my Fundamentals of Index Tuning online class: https://www.brentozar.com/product/fundamentals-of-index-tuning-2/
I have an Index where all the columns are null. How is the index used?
That’s not really the scope of this blog post, but it’s a neat idea for a future blog post.