If You Can’t Index It, It’s Probably Not SARGable

Thumbs Rule OK?

When writing queries, it’s really common to want to take shortcuts to express some logic.

I know, it “works fine”, and the results are “right”, and you have “more important” things to do.

But take a minute here for future you.

Just think of reading this post as a continuation of the infinite thumb-scroll that your life has become.

And Examples!

For instance, it’s a lot faster for you to write ISNULL(somecol, 0) = 0 or YEAR(somedate) = 2018.

In some ways it may even seem intuitive to write queries that express simple equalities like this rather than somecol = 0 OR somecol IS NULL.

The problem is that this isn’t how indexes store, or have their data retrieved, and you can’t create indexes like this:

CREATE INDEX ix_nope ON dbo.zilch (ISNULL(nada, 0))

Hence the title: If you can’t index it, it’s probably not SARGable.

Practically?

I’ve tried to point this out with functions like ISNULL and DATEDIFF.

Using the DATEDIFF example, if you write a query that does this: WHERE DATEDIFF(DAY, day1, day2) = 90

And you’ve got an index like this: CREATE INDEX ix_dates ON dbo.orders (day1, day2)

Nothing about the index is tracking how many days apart day1 and day2 are. Nor minutes, nor hours, nor milliseconds.

It’s just data ordered first by day1, then by day2. Multi-column statistics also don’t track this. It’s up to you to define the data points you need to retrieve for your data.

The same goes for many other built in system functions, but there’s no warning sign on the doc pages that says stuff like:

  • “Hey, this is only here to make presenting data easier.”
  • “The optimizer can’t do that, Dave.”
  • “Only do this if you want to keep champagne flowing for consultants.”

It’s fairly well-documented what happens when you use these on-the-fly calculations as predicates: Nothing good.

For all the hand-wringing there is about index scans, there’s equal amounts of poorly written queries that have no hope of every doing a seek.

Mass Appeal!

As development teams mature and start writing queries beyond simple selects, they may add non-SARGable constructs in other places.

CTEs, Derived Tables, and non-indexed Views are easy examples. The backing query results aren’t materialized anywhere, so they’re not good candidates for predicates.

For example, this query:

Putting the COALESCE inside the CTE doesn’t magically make a new set of physical data — it’s just another expression. It’s really no different than if you did it directly in the WHERE clause without the CTE.

Going back to the title, you can’t index a CTE or a derived table. Abstracting expressions in there doesn’t persist them.

They won’t be SARGable here, either.

Unawares?

If you think they’re bad there, wait until you start trying to order data by expressions.

Even with a perfectly fine index thrown into the mix…

We’re forced to sort everything by hand.

Or whatever your computer uses.

Mine uses hands.

Tiny hands.

This isn’t your friend

This is the kind of query that really makes your CPU fans kick in.

The kicker is that the optimizer may decide to inject a sort into your plan that you didn’t ask for.

Is There An Exit?

When you’re writing queries to be reliably fast, take a close look at what you’re expecting the optimizer to do.

Indexes can go a long way, but they’re not cure-alls. They still have to contain data the way you’re trying to use the data.

If they don’t, you might need to look at temp tables, computed columns, denormalizing, or lookup/junction tables to set data up the way your queries use it.

Thanks for reading!

Previous Post
Azure SQL Managed Instances and Database Corruption
Next Post
Why Multiple Plans for One Query Are Bad

7 Comments. Leave new

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.

Menu
{"cart_token":"","hash":"","cart_data":""}