The Many Mysteries of Merge Joins

Last Updated April 8, 2018

Not A Single Picture Of A Zipper

Image humbly borrowed from https://70srichard.wordpress.com/2014/12/17/beverly-hills-cop/

There are some interesting things about Merge Joins, and Merge Join plans that I figured I’d blog about.

Merge joins have at least one interesting attribute, and may add some weird stuff to your query plans.

It’s not that I think they’re bad, but they can be tricky.

Lots of people see a Merge Join and are somewhere between grateful (that it’s not a Hash Join) and curious (as to why it’s not Nested Loops).

Oh, you exotic Merge Join.

E pluribus unum

In a “good” Merge Join, the join operator in the query plan will have the Many to Many: False attribute.

The optimizer knows this because the Primary Key on Id (though a unique index or constraint offers similar assurances) is distinct for each value in the Users table.

Having one unique input give you a one to many Merge Join.

The statistics TIME and IO profile for this query is about like so:

Table 'Posts'. Scan count 1, logical reads 2282, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Users'. Scan count 1, logical reads 417, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 329 ms,  elapsed time = 333 ms.

Table 'Posts'. Scan count 1, logical reads 2282, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Users'. Scan count 1, logical reads 417, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 329 ms, elapsed time = 333 ms.

Not too shabby for one meeeeeeeeeeeeeeeeeeeellion rows.

E pluribus pluribus

In a “bad” Merge Join, that attribute will be True.

Why does this happen, and why is it bad?

It happens most commonly when there are, or may be duplicates on both sides of the results. They can also happen when the outer input has duplicates, but quite often the optimizer will rewrite the JOIN order to put it on the inside to avoid having to use a worktable, etc. Thanks to Adam and Kalen for nudging me in the comments to clarify this part.

Internally, the Merge Join will spin up a work table (similar to how a Hash Join operates), and work out the duplicate situation.

The stats TIME and IO profile of this plan looks like so:

Table 'Worktable'. Scan count 1, logical reads 28836, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Posts'. Scan count 1, logical reads 38, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Votes'. Scan count 1, logical reads 5, physical reads 1, read-ahead reads 464, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 343 ms,  elapsed time = 366 ms.

Table 'Worktable'. Scan count 1, logical reads 28836, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Posts'. Scan count 1, logical reads 38, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Votes'. Scan count 1, logical reads 5, physical reads 1, read-ahead reads 464, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 343 ms, elapsed time = 366 ms.

If we were to, say, choose that serial Merge Join plan in the compilation of a stored procedure that became the victim of parameter sniffing, we could run into trouble.

Table 'Worktable'. Scan count 505697, logical reads 67579461, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Posts'. Scan count 1, logical reads 62710, physical reads 0, read-ahead reads 55333, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Votes'. Scan count 1, logical reads 15112, physical reads 0, read-ahead reads 14640, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 459079 ms,  elapsed time = 459432 ms.

Table 'Worktable'. Scan count 505697, logical reads 67579461, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Posts'. Scan count 1, logical reads 62710, physical reads 0, read-ahead reads 55333, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Votes'. Scan count 1, logical reads 15112, physical reads 0, read-ahead reads 14640, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 459079 ms, elapsed time = 459432 ms.

Yes, that is seven minutes and forty seconds. A little over half of one metric cigarette break.

Head to head, the large merge (many to many) is costed higher than the smaller merge (one to many). But you won’t see that in a parameter sniffing situation.

You’ll only see the lower costed Merge Join version of the plan.

The optimizer will sometimes to try protect itself from such hi jinks.

Aggregation Ruling The Nation

In some cases, the optimizer may inject an aggregation into one side of the join ahead of time to distinctify the data. It doesn’t need to do both — we only need one distinct input for the many to many attribute to be false.

I haven’t seen a situation where both inputs get aggregated merely to support the Merge Join, but it might happen if you ask for other aggregations on the join column.

It can use a Stream Aggregate, which would generally make more sense, since both the Stream Aggregate and the Merge join require sorted data.

Under less reasonable circumstances, you may get a Hash Match Aggregate. This plan has the additional misfortune of needing to re-order the data for the Merge Join. Teehee.

If you see this, something has truly gone wrong in your life. This query was heavily under the hint-fluence.

A Sort Is A Sort

Much more common is seeing a Sort injected into a plan to support one or more downstream operators that require sorted input (Merge Joins, Stream Aggregates, Windowing Functions).

For instance, a query like this:

WITH Votes
    AS (   SELECT   v.UserId,
                    ROW_NUMBER() OVER ( PARTITION BY v.UserId
                                        ORDER BY v.Id ) AS rn
           FROM     dbo.Votes AS v )
SELECT  COUNT(*) AS records
FROM    dbo.Posts AS p
JOIN    Votes AS v
    ON p.OwnerUserId = v.UserId
WHERE   v.rn = 1;

WITH Votes

AS ( SELECT v.UserId,

ROW_NUMBER() OVER ( PARTITION BY v.UserId

ORDER BY v.Id ) AS rn

FROM dbo.Votes AS v )

SELECT COUNT(*) AS records

FROM dbo.Posts AS p

JOIN Votes AS v

ON p.OwnerUserId = v.UserId

WHERE v.rn = 1;

May give you a plan like this:

In this case, the Sort happens early on to support the Window Function. It also aids the Stream Aggregate, but whatever. Once data is sorted, the optimizer tends to not un-sort it.

The Sort in the next example will be needed with no index on, or where the join column is not the leading column in the index (there’s an If here, which we’ll get to).

If this is my index on the Votes table, data is sorted by PostId, and then UserId

CREATE INDEX ix_votes_uid ON dbo.Votes (PostId, UserId);

1	CREATE INDEX ix_votes_uid ON dbo.Votes (PostId, UserId);

When my query is just a simple join, like this:

FROM    dbo.Votes AS v
JOIN    dbo.Posts AS p
    ON v.UserId = p.OwnerUserId
ORDER BY p.OwnerUserId;

FROM dbo.Votes AS v

JOIN dbo.Posts AS p

ON v.UserId = p.OwnerUserId

ORDER BY p.OwnerUserId;

My query plan will look like this, with a big honkin’ Sort in it:

On the other hand, if my query looks like this:

FROM    dbo.Votes AS v
JOIN    dbo.Posts AS p
    ON v.UserId = p.OwnerUserId
WHERE v.PostId = 11227809
ORDER BY p.OwnerUserId

FROM dbo.Votes AS v

JOIN dbo.Posts AS p

ON v.UserId = p.OwnerUserId

WHERE v.PostId = 11227809

ORDER BY p.OwnerUserId

My WHERE clause filters the leading column to a single PostId (one Post can be voted on by many Users), the UserId column will already be sorted for that single PostId value.

We won’t need to physically sort data coming out of that.

Out In The Street, They Call It Merge Join

I hope you learned some stuff that you can use when troubleshooting, or trying to understand a Merge Join plan.

This is one of many places that the optimizer may inject a Sort into a plan that you didn’t ask for.

Thanks for reading!

ZIPPER FREE!

Building SQL ConstantCare®: The Minimum Viable Product (MVP)

Building SQL ConstantCare®: 10% of you have Priority Boost on.

6 Comments. Leave new

Kalen Delaney
April 4, 2018 1:48 pm

In the section right after the graphic showing Many to Many = True, you say “t happens when there are, or may be duplicates in one side of the results.” I would think duplicates on one side would be One to Many and not a problem, as in the first example. I would thing Many to Many = True would happen when there are duplicates in BOTH sides.

Reply
- Erik Darling
  April 4, 2018 2:13 pm
  
  Hi Kalen – yes you’re right, I’ll correct that. Thanks.
  
  Reply
  - Adam Machanic
    April 5, 2018 2:14 pm
    
    No, that’s not correct. Many-to-many only depends upon whether or not there are duplicates on the OUTER side.
    
    —
    DECLARE @i TABLE(y INT)
    INSERT @i SELECT 1 UNION ALL SELECT 1
    
    SELECT
    *
    FROM @i AS x
    INNER MERGE JOIN
    (
    SELECT DISTINCT y FROM @i
    ) AS z (y) ON
    z.y = x.y
    —
    
    Reply
    - Erik Darling
      April 5, 2018 5:03 pm
      
      Adam – yeah, but I’ve noticed the optimizer will flip join order a lot of times. The perils of responding from vacation. I’ll rework this section when I get home! Thanks!
      
      Reply
      - Adam Machanic
        April 5, 2018 5:20 pm
        
        Yes, the optimizer will flip the join order as it sees fit. I think your initial phrasing as criticized by Kalen was mostly correct from a logical perspective — but probably better to say something like “it may occur … if at least one of the two input sets is not guaranteed unique.” From a tuning perspective it is especially important to understand that it’s only the outer input that matters and putting a join hint or other query construct in place to force the order to change can effect a fix.
👉 Le JOIN in SQL: un approfondimento completo
July 4, 2025 10:12 am

[…] In questo articolo trovi un approfondimento su un particolare algoritmo di JOIN utilizzato da SQL Server per migliorare le performance delle query https://www.brentozar.com/archive/2018/04/many-mysteries-merge-joins/ […]

Reply

The Many Mysteries of Merge Joins

Not A Single Picture Of A Zipper

E pluribus unum

E pluribus pluribus

Aggregation Ruling The Nation

A Sort Is A Sort

Out In The Street, They Call It Merge Join

Related

6 Comments. Leave new

Leave a Reply Cancel reply

Hi! I’m Brent Ozar.

The Many Mysteries of Merge Joins

Not A Single Picture Of A Zipper

E pluribus unum

E pluribus pluribus

Aggregation Ruling The Nation

A Sort Is A Sort

Out In The Street, They Call It Merge Join

Related

6 Comments. Leave new

Leave a Reply Cancel reply

Hi! I’m Brent Ozar.

Get Free SQL Stuff