If your query doesn’t have an ORDER BY clause,
you can’t reliably predict the order of your results over time.
Sure, it’s going to look predictable at first, but down the road, as things change – the indexes, the table, the server’s configuration, the size of your data – you can end up with some ugly surprises.
Let’s start with something simple: we’ll do a SELECT from the Stack Overflow Users table. The clustering key on that table is Id, the identity number that starts at one and goes up to a bajillion. When we do this SELECT, the data comes back in the order of the clustered index:
But if someone creates an index on DisplayName and Location, then suddenly SQL Server will choose to use that index rather than the clustered index:
Here’s the execution plan as proof:
Why did SQL Server choose to use that index even though it didn’t need to have DisplayName and Location sorted? Because that index is the narrowest/smallest copy of the data that SQL Server needs to fetch. Let’s examine the index sizes with sp_BlitzIndex:
The clustered index of the table (CX/PK) has 8.9M rows, and it’s 1.1GB in size.
The nonclustered index on DisplayName/Location also has all 8.9M rows, but it’s only 368MB in size. If you have to scan the entire thing in order to get your query results, why not scan the smallest object, which will get done more quickly? And that’s exactly what SQL Server chose to do.
“Yeah, but my query has a WHERE clause.”
Okay, now that we’ve got an index on DisplayName, Location, try running a query that looks for a particular DisplayName. The results come back in DisplayName order:
And the execution plan uses the index:
But now if you try for a different username, they’re not sorted by name at all:
Because SQL Server decided that there are more Alexes, so it made more sense to do a clustered index scan rather than the seek + key lookup:
These are simple examples, and I could go on.
Even in these really simple cases, you can’t guarantee that SQL Server will always use the copy of the data that you expect. Over the last few weeks, I’ve encountered a lot more complex examples:
- Someone dropping an index that was used by a query
- Someone turned on Forced Parameterization, which changed SQL Server’s row estimates for a query plan, causing it to make different index choices
- Someone changed the Compatibility Level of a database, introducing a newer Cardinality Estimator, which resulted in a Different Plan
If you need the data to be ordered tomorrow, when you’re not looking, just put an ORDER BY in there. Your grandchildren will thank you when they have to troubleshoot your query.