If I take the Users table from any Stack Overflow database, put an index on Reputation, and write a query to find the top 100 users sorted by reputation, descending:
1 2 3 4 5 |
CREATE INDEX Reputation ON dbo.Users (Reputation); SELECT TOP 100 * FROM dbo.Users ORDER BY Reputation DESC; |
It doesn’t matter whether the index is sorted ascending or descending. SQL Server goes to the end of the index and starts scanning backwards:
If you right-click on the Index Scan and go into Properties, you can see that the data is ordered, and SQL Server is scanning backwards:
You don’t need a separate descending index for that.
But if you sort multiple fields ASC/DESC/ASC, it gets complicated.
Say we’re looking for the highest-reputation people who live in London, sorted by reputation descending, and in the event of a tie, we want them listed in alphabetical order. Here’s an index we might build, plus the query:
1 2 3 4 5 6 7 |
CREATE INDEX Location_Reputation_DisplayName ON dbo.Users(Location, Reputation, DisplayName); SELECT * FROM dbo.Users WHERE Location = N'London, United Kingdom' ORDER BY Reputation DESC, DisplayName; |
In true Clippy style, SQL Server is recommending an index on Location – but with the rest of the columns just included in the index, not even sorted. Good times.
Ah, Clippy, good times. Stay in school, buddy.
We’re getting the sort because our data is kinda sorted, but not sorted enough, and if you hover your mouse over the Sort and look at the bottom of the tooltip, you’ll see that SQL Server is sorting by Reputation descending, DisplayName ascending.
To understand why, think about how the data is arranged when we seek to London, go to the highest reputation, and start reading backwards. Here’s a visualization query to see what’s in the index:
1 2 3 4 |
SELECT Location, Reputation, DisplayName FROM dbo.Users WHERE Location = N'London, United Kingdom' ORDER BY Location, Reputation, DisplayName; |
To simulate a backwards scan, go to the end of the result set and start reading upwards. At first, it looks like the data is perfectly sorted, but as you continue to scan backwards and start hitting ties, we have a problem:
If you’re reading from the bottom up:
- Row 6752: you read the first 1140, tsvallender
- Row 6751: you read another 1140, and you realize that the data’s not in order
- You could in theory now jump back down and re-read 6752, and now you have the 1140s in order, but…how do you know that 6751 was the last 1140? You’d have to look up at row 6750
- Row 6750: you read this, and it’s the first 1138, but
- Row 6749: he’s also 1138, so you have to keep reading upwards, and…
That’s gonna get old. It’s too much random jumping around, and it’s not a scan anymore, so rather than doing that dancing during the reading, SQL Server just says, “I’m gonna add a sort to the execution plan because the data isn’t ordered the way I need it to be ordered.”
We could fix that with a DESC index.
But it can’t just be EVERYTHING descending. The sort order has to match our query’s order, like this:
1 2 3 4 5 6 7 |
CREATE INDEX Location_Reputation_DESC_DisplayName ON dbo.Users(Location, Reputation DESC, DisplayName); SELECT * FROM dbo.Users WHERE Location = N'London, United Kingdom' ORDER BY Reputation DESC, DisplayName; |
So now our execution plan doesn’t have a sort or a memory grant:
Thing is, though, I almost never need to use this trick. Most of the time, the sort in the query plan just isn’t that expensive – like in this case, if you repeatedly compare the two queries, we’re talking about very small differences in memory grants and CPU consumption. The difference grows as the volume of sorted data grows, like if we’re talking about bringing back millions of rows, or if the query frequency grows, like if we’re running the query thousands of times per second.
Want to learn more tricks like this?
If you like tricks like this, you’ll love my Mastering Index Tuning class. The next one starts December 8th (iCal), and after that, Feb 12-14 (iCal.)
Folks with a Live Class Season Pass are welcome to drop in anytime, or just watch the Instant Replays on a schedule that works for them. Just head to my current training page, check out the schedule at the bottom of the page, and grab the calendar files for the classes you’re interested in. You don’t have to register ahead of time – just drop in anytime I’m streaming.
See you in class!
4 Comments. Leave new
I believe that at least in previous versions of SQL Server, reverse-order scans could not go parallel, so that would be a good use case for sorting an index descending.
Not sure if this is an issue in 2019, though. Not on a computer right now. 🙂
Yeah, I just tend to think that if you’re scanning so much data off an index that you need to go parallel, and that the index scan can only go in one direction (backwards), then you probably have bigger concerns.
Hi Brent,
It is me, or the query on the third screenshot (backwards_with_sort) does not match the query in the execution plan.
Nonetheless one of the weirdest examples (in my opinion) when SQL Server goes crazy because of ORDER BY is this:
https://dba.stackexchange.com/questions/10215/inclusion-of-order-by-on-query-that-returns-no-rows-drastically-affects-performa
Have you seen it before?
Cheers
Maciej – sorry about that, the query in the screenshot was irrelevant and I’ve removed it.
I took a quick look at the Stack question and it looks like you’ve got great answers on there, so I didn’t dig deeper. I’d love to, but I’m pretty busy with work right now. Thanks though!