I see a lot of indexes
Most of them are pretty vanilla. Some are pretty memorable, like one that I saw with 93 INCLUDES. A lot of things that people could do with indexes get overlooked, either because they don’t know about them, or they’re not sure when they’ll help. So, here goes my attempt at shedding light on some of my favorite indexing tricks. This is by no means exhaustive, just a few pointers in the right direction.
Uniqueness is one of the most important pieces of information you can give to the optimizer. If you have faith in your data, I highly recommend testing this out. It really helps with row estimates. Imagine you have a query that wants IDs 1-100. It’s much easier for the optimizer to come up with a good plan if it knows it will only see each of those IDs once. The impact gets bigger as the number of rows you’re hitting grows. An additional upside is that you can use columns in unique indexes in foreign keys. This can be especially helpful if you’re working with hierarchies. To create a unique index, the syntax is simple. Of course, you can use Included columns, and you can also create them as…
CREATE UNIQUE NONCLUSTERED INDEX ix_happydays ON dbo.Table (column(s)... )
Are you one of those crazy people who doesn’t write SELECT * FROM dbo.Table queries constantly? Do you ever use those crazy little things called WHERE clauses? Filtered indexes can really help! Sometimes! There are rules, of course. You can’t use certain operators, like BETWEEN, but you can use all the equalities and inequalities (=, >, >=, <, <=, <>). You can’t use NOT IN or CASE. You also can’t use OR, but you can use IN. Weird, right?
And no, you can’t use date functions to have constantly shifting 30/60/90 day windows. Not allowed. If you want to do that, you’ll either have to drop and re-create the index with the correct window, or put the filtered index on a BIT column that you flip to 1/0 with an UPDATE to capture the window you want. Then you filter on the BIT column rather than the date column.
Also, if you use any non-default ANSI SET options, you may see failing INSERT/DELETE statements. Which ones?
- SET ANSI_NULLS ON
- SET ANSI_PADDING ON
- SET ANSI_WARNINGS ON
- SET ARITHABORT ON
- SET CONCAT_NULL_YIELDS_NULL ON
- SET QUOTED_IDENTIFIER ON
- SET NUMERIC_ROUNDABORT OFF
Creating filtered indexes is really simple, but you’ll definitely to at least put whichever column(s) you’re filtering on as INCLUDEs. It can give SQL an extra leg up when searching within the filter definition. Boy, this list keeps going downhill. It’s…
CREATE INDEX ix_VirginiaSlims ON dbo.Table (column(s)...) INCLUDE (filtercol) WHERE filtercol some expression
Man, sorting sucks. Sorting really sucks. I hate it. Not like, within a query necessarily, but at the end. Whether it’s via ORDER BY or OFFSET/FETCH. I just get annoyed when I see it. Even worse, I get annoyed when I see expensive SORT operators. One thing a lot of people don’t realize is that SQL hates sorting, too. That’s why it makes it so miserably expensive! But you crazy kids rarely pay attention to reverse incentives, so let’s talk about this situation, and how indexes can help.
If your order by calls for DESC, and you want to make it as painless as possible for SQL Server, that’s how you should define your index. One sort of tricky bit here is that when you index to support sorts, you really have to put the ordering columns all the way left in your index definition. You’ll have to consider any JOIN or WHERE predicates second. It can absolutely be worth it, especially if sorts are routinely spilling to tempdb. Sorting strings is especially painful, I can’t figure out why you’d want that even if I had a…
CREATE INDEX ix_downwardspiral ON dbo.Table (SortaKinda DESC)
These have rules too! Boo! Rules! The same rules as filtered indexes for ANSI SET options, actually. There are also rules about when they can be PERSISTED. This is a cool feature, because you don’t need to compute the column at query time. It will get computed and stored on disk like every other column. The flip side of this is that it can slow down table modifications if your computation is expensive.
In short, a computed column can’t be persisted if the output would change between runs. The simplest example is GETDATE(). You can’t persist a computed column with GETDATE() in it because it will change from run to run. Keeping it up to date at the storage level would be miserable. You can’t modify a computed column directly, because that wouldn’t make any sense. It’s computed. That means you don’t compute it.
But they’re cool in that they can be used in all kinds of constraints! PK, NOT NULL, and CHECK, to be specific. But this is about indexes, and some types of computed columns can be indexed for even better performance gains. Lucky you! Here are some examples of what can be added as a persisted column, and what can be indexed.
ALTER TABLE dbo.SalesOrders ADD F1 AS DATEADD(DAY, 1, OrderDate) /*Works*/
ALTER TABLE dbo.SalesOrders ADD F2 AS DATEADD(DAY, 1, GETDATE()) /*Works*/
ALTER TABLE dbo.SalesOrders ADD F3 AS DATEADD(DAY, 1, OrderDate) PERSISTED /*Works*/
ALTER TABLE dbo.SalesOrders ADD F4 AS DATEADD(DAY, 1, GETDATE()) PERSISTED /*Fails*/
CREATE INDEX ix_filter1 ON dbo.SalesOrders (F1) /*Works*/
CREATE INDEX ix_filter2 ON dbo.SalesOrders (F2) /*Fails*/
CREATE INDEX ix_filter3 ON dbo.SalesOrders (F3) /*Works*/
CREATE INDEX ix_filter4 ON dbo.SalesOrders (F4) /*Fails because the column add failed*/
One thing to be extra careful about is using scalar UDFs in computed columns, which used to be a popular way to break out…
You’re all using XML. All of you. And you’re doing crazy things with it. You’re using XQuery like it’s going… Well, it is going out of style.
All the cool kids use JSON now. Didn’t you get the invitation to the party? No? Weird. Like you.
XML indexes can help make things faster! But there are, of course, more rules. Rules you must follow, lest you be henpecked to death by red text in SSMS. The table must have a Primary Key on it. The Primary XML index demands it. Which brings us to the next rule! You need to define a Primary XML index! There are also Secondary XML indexes, which can be created by PATH, VALUE, and PROPERTY. As you can guess, they all have different use cases. I’ve never been able to nail down a use case for PROPERTY, but PATH and VALUE can help with the .exist() method, depending on how you’re querying the XML.
And of course, if you’re on SQL Server 2012+, you can use Selective XML indexes! These don’t require a Primary XML index, but do require you to specify all the paths/values/nodes you want to index. The benefit is that they’re much smaller than other XML indexes, and from the little I’ve gotten to play with them, a lot faster, too. If you’re interested, I suggest you set aside plenty of time to figure out all the syntax possibilities, as well as the limitations and guidelines. I’ll follow up with a post about them in the near future.
So how are you doing?
Are you using indexes to their full potential, or are you just creating the same ol’ same ol’? Don’t get me wrong, good same ol’ same ol’ indexing can work wonders, but sometimes you need to reach a little deeper to correct a problem.
I left out ColumnStore and the different Compressed indexes here because they’re Enterprise only, not because I don’t like either as a feature. ColumnStore especially has come a long way in 2016.
Thanks for reading!
Brent says: indexes are the single biggest technique to make your database faster with as little code change and expense as possible. You can go an incredibly long way just by working with indexes, and when you think you’ve exhausted your possibilities, there’s almost always more new tricks to discover.