Query Exercise: What Makes SELECT TOP 1 or SELECT MAX Different?

Last Updated August 16, 2024

These two queries both get the same answer from the Stack Overflow database:

SELECT TOP 1 LastAccessDate
    FROM dbo.Users
    ORDER BY LastAccessDate DESC;

SELECT MAX(LastAccessDate)
    FROM dbo.Users;

1

2

3

4

5

6

SELECT TOP 1 LastAccessDate

FROM dbo.Users

ORDER BY LastAccessDate DESC;

SELECT MAX(LastAccessDate)

FROM dbo.Users;

But do they go about their work the same way? As it turns out, no – even with no indexes at all, the two queries get different execution plans, and one of them is faster than the other:

On my particular server, with my server-level settings, database settings, and table structure, MAX runs faster and uses about 15% less CPU time to accomplish the same result.

But read these instructions carefully, because this week’s Query Exercise is not about making them faster! I wanna emphasize that really clearly because I know I’m gonna get a bunch of people who yell “create an index” in the comments.

Your challenge is to find things that will change their performance, things that will cause them to produce even more different execution plans. I’ll give you an example: let’s create an index that does not lead with LastAccessDate, and then try the queries again:

CREATE INDEX Location_LastAccessDate
    ON dbo.Users(Location, LastAccessDate);

1 2	CREATE INDEX Location_LastAccessDate ON dbo.Users(Location, LastAccessDate);

The new actual execution plans:

TOP 1 uses a sort, burns 3.3 seconds of CPU time, runs in <1 second, and goes parallel.

MAX uses a stream aggregate, burns 1.5 seconds of CPU time, but stays single-threaded, so it takes longer.

If you wanted to call a winner by lower CPU time, it’d be the MAX, but if you wanted lower execution time, it’d be the TOP 1. Of course, you could change those queries further in order to get a different winner, but I just wanted to start with an example to show you the kinds of influencing factors we’re looking for.

Again, your challenge is to find things that change their performance. I’m doing this because I saw someone say online that MAX is always faster, and that’s just not even close to the truth. There are many, many factors involved in affecting whether TOP 1 or MAX is best for a particular use, and your challenge this week is to find as many of those factors as you can. Absolutely anything is on the table! You can change the server settings, database settings, the table itself, and the query.

Put your queries in a Github Gist and their actual query plans in PasteThePlan, and include that link in your comments. Check out the solutions from other folks, and compare and contrast your work. Then, read my thoughts in the answer post. Have fun!

Dew Drop – Late Edition – August 7, 2024 (#4240) – Morning Dew by Alvin Ashcraft
August 8, 2024 1:26 am

[…] Query Exercise: What Makes SELECT TOP 1 or SELECT MAX Different? (Brent Ozar) […]

Reply
Jay Harper
August 8, 2024 7:37 am

If I recall correctly (24 years ago!), we had an issue with a trading system that was using a top 10 to push results to the GUI quickly and because it wasn’t using an index the top 10 ended up being kind of random. Some users had different records than others. Some had the same. I sorted it by fixing a particular index to the query.

Reply
- Vedran Galic
  August 8, 2024 10:59 am
  
  To ensure guaranteed sorting, you need to use ORDER BY.
  
  Reply
  - Brent Ozar
    August 8, 2024 11:00 am
    
    Bingo! You got it. That was the problem Jay was hitting, I guarantee it.
    
    Reply
    - Jay Harper
      August 9, 2024 8:14 am
      
      Yep probably so. 24 years ago! 😀
      
      Reply
Montro1981
August 8, 2024 9:23 am

I don’t have access to a SO database right now, I might have time to append my post later this week.

It is funny, we got asked a similar question a while back on the SQL Discord server.

I’m going to post a link to the discord server:
https://discord.com/channels/698642420497383504/723304166902202428/1247279961610653808

The question was:
“Hello, i realised that MAX(field_name) is the same as doing the following:

SELECT TOP 1 field_name
FROM table_name
ORDER BY field_name DESC”

The answers given were:
1. For a field without an index the MAX() is cheaper because the Stream aggregate (Aggregate) operator is cheaper than the Sort (Top N Sort) operator used in the TOP 1 Column ORDER BY Column DESC.
2. For a field with a Unique Index (like a Primary key) the plans are almost equal – the MAX() query sneaks in a Stream aggregate (Aggregate) operator for 1 row.

On a larger table the TOP 1 Column ORDER BY Column DESC, might even go parallel.

So, while the results can be the same, they may result in vastly different performance and resource usage characteristics, mostly depending on your indexing situation.
This can be a fun game to play. ?

~~~

As shown by Brent indexes can affect the performance of the query.
On a non indexed table the size of the table affects performance as well, a Top N Sort on a small table will have little to non impact as reading to whole table for the sort doesn’t take a lot of time, while reading a large table will take a lot of time.

Reply
- Brent Ozar
  August 8, 2024 2:05 pm
  
  Yep, you’re playing the game! And it is indeed fun to play – the more you look at different things involved with the query, the more blown away you’ll be at the number of factors that affect it.
  
  Reply
Michael McCormick
August 8, 2024 2:54 pm

Using StackOverFlow2013 at Compat level 160, I converted dbo.Users to a Memory Optimized table (after first removing a pesky IsValidUrl column). Used Nonclustered HASH for the Id index, and Nonclustered for the Reputation index. Added no other indexes.
The plan, https://www.brentozar.com/pastetheplan/?id=Bkn7zIfcR , looked identical to the original posted, but significantly reduced the Cost by 75%, and favored the MAX() query even more.
Also, both queries went parallel when, for me, neither did originally. Even though their cost was far over the threshold.

Reply
- Brent Ozar
  August 8, 2024 2:58 pm
  
  Neat idea on trying a memory-optimized table! In the query plans, doesn’t look like the Reputation index is used at all, eh? I’m guessing you haven’t restored the Stack Overflow database after running a few experiments over the weeks – I’d do that just to start a clean slate for each new round of experiments. But neat idea though!
  
  Reply
Query Exercise Answer: What Makes SELECT TOP 1 or SELECT MAX Different? - Brent Ozar Unlimited®
August 14, 2024 1:15 pm

[…] heard advice that SELECT MAX is faster than SELECT TOP 1, and that’s not quite true. I asked you to find factors that would cause these two queries to get different execution […]

Reply