People love to search.
Google has us addicted to fast, easy search functions. Users expect every application to have a built-in blazing-fast search functionality. To pull that off, developers build search queries that let users enter a string, and we ask SQL Server to find matches. For example, say our users need to find some nuts. We take their string, put percent signs on either side, and then search for that field in the product names:
WHERE [Name] LIKE '%nut%'
This works great on our machine – but when we scale it up to hundreds or thousands of simultaneous users, or if we ramp up the number of products, our response time sucks. Let’s find out why by checking the execution plan. In SQL Server Management Studio, click Query, Display Estimated Execution Plan, and we get:
There’s only one operation, which would sound like a good thing, but that one operation is a clustered index scan. That means SQL Server has to read every row out of the Product table, check to see whether it’s got “nut” anywhere in the name, and then return our results.
But wait – there’s an index on Production.Product for the Name field, and SQL Server will use that index if we have a slightly different query:
If we take the leading % wildcard out, notice that this time it does two operations. First, it does an index seek against our Name index, then it looks up the corresponding rows in the base table (Product). Two operations might sound more expensive, but if we looked at the total cost for each query, the second one would be faster in situations where we’re only pulling a small number of search results back. If this table was for a company called Nothin’ But Nuts, on the other hand, we would probably still need to scan the entire table, but that’s a discussion for another day.
So why doesn’t SQL Server use the index for the %nut% query? Pretend for a moment that you held a phone book in your hand, and I asked you to find everyone whose last name contains the letters HAM. You would have to scan every single page in the phone book, because the results would include things like:
If, on the other hand, I asked you to find everyone whose last name began with the letters HAM, you could turn straight to the H page in the phone book, scan through a few lines, and quickly get the information you need.
When I asked you for everyone beginning with the letters HAM, my query was sargable. When I asked you for all last names containing HAM anywhere in the name, my query was not sargable – meaning, you couldn’t leverage the indexes to do an index seek. (Yes, sargable is sort of a real word – it stems from the phrase Search Arguments.)
Since we can’t talk users out of using search queries, and since non-sargable queries don’t scale, we need to find a better way to search for text. That’s where SQL Server’s Full Text Search comes in. Unfortunately, your queries must change – normally when you add an index, your queries just magically become faster. Full text indexes don’t work that way. You have to tweak your queries to use operators like CONTAINS instead of LIKE. If you’re dealing with an old boat anchor of a database and the developers are long gone, you might be out of luck.
%String% is just one thing that will cause SQL Server to do slower scans instead of high-performing index seeks. To learn more about sargability, check out:
- Beginner-level: StackOverflow question: What makes a SQL statement sargable?
- Advanced-level: Rob Farley on SARGable functions
- Book: SQL Server 2008 Query Performance Tuning Distilled
- Implicit conversions are another common sargability problem. Read Jonathan Kehayias’ post on finding index scans due to implicit conversions, Tibor Karasazi’s post on the performance implications, Jonathan Kehayias’ post on implicit conversion performance issues, Craig Freedman’s post on errors caused by implicit conversions.