Forcing Join Order Without Hints

SQL Server
18 Comments

Brent buys lunch for the ladies

The purpose of this post is to show a bit of syntax that often gets overlooked in favor of using query hints to force joins to occur in a particular order. We’ll start by creating three tables. One for employees, one for orders, and one for items in the order.

A SQL celebrity gossip blog got a tip that someone from BOU ordered take-out. Not exactly an earth-shattering event, but querying minds want to know!

So they write a query, and then they look at the plan.

And that’s way harsh. SQL went and changed our LEFT JOIN into an INNER JOIN. What was it thinking? Now we don’t know who Brent is having lunch with.

2015-04-28_14-20-22
How in the why in the heck did that LEFT JOIN turn into an INNER JOIN? SQL is full of it today.

Who ordered the KALE?

Okay, we thought about it some. No more INNER JOIN.
We’ll get this done with another LEFT JOIN.

Unless you’re Ernie, that’s wayyyyy too many Brents.

Too many Brents on the dance floor.
Too many Brents on the dance floor.

First of all, ew. But yeah, you can do this, and it will come back with the right results.

2015-04-28_15-04-31
WHAT SCALARS ARE YOU COMPUTING? WHAT FRESH HELL DID YOU SPAWN FROM?

We can even try an OUTER APPLY. That’s a little nicer looking as a query…

… But same yucky plan.

Hi, I’m a cool trick.

Nicer plan and a little less CPU than the others, on average.

Pleased to meetcha!
Pleased to meetcha!

This is an interesting concept to play with. With more than a couple JOINs, you can start using parentheses to group them together, like so:

If you switch the order of the ON clauses in the second to last query, you’ll get an error. Take a guess why in the comments!

Brent says: Okay, first off, I don’t normally drink an entire box of wine for lunch, but I had to wash down all that kale. Second off, judging by these execution plans, SQL Server intercepted my wine delivery.

Kendra says: I’ve found parentheses join hints twice in production code in SQL Server. In both cases, nobody knew why they were there, what would happen if they rewrote the query, or even if they’d been used on purpose or if it was just an accident. If you use this technique, document your code heavily as to what you’re doing and why, or you’ll be that person everyone grumbles about.

Previous Post
SQL Server Version Detection
Next Post
SQL Server 2016 Security Roadmap Session Notes #MSIgnite

18 Comments. Leave new

  • Hi,
    This doesn’t really detract from the point of the post but threw me for a loop when I was testing it on my dev box:
    In the first query, the join between #Lunch and #LunchOrders is on lo.LunchOrderID = l.LunchID
    But in all the subsequent queries, the join is on lo.LunchID = l.LunchID
    So the results are different, as the final query has all the ordered items against a single lunch as per the underlying data, rather than spread (erroneously) across all the individual lunches.
    Neat trick all the same
    ross

    Reply
    • Erik Darling
      May 6, 2015 9:58 am

      D’oh. I’ll fix it up in a bit. Thanks for pointing it out. The perils of coming back to a post two days later!

      Reply
  • Jason Elias
    May 6, 2015 11:01 am

    Slick, haven’t seen that before — as to why changing the JOIN order in the 2nd to last query might produce an error, I’d guess that SQL Server wants to process the JOIN between #Lunch and #LunchOrder before evaluating the JOIN between #Lunch and #Ozars (which is outside the parentheses). But that’s just a guess..

    You’re posts are hilarious to read, keep ’em coming.

    Reply
  • James Lupolt
    May 6, 2015 11:55 am

    I did not know about parentheses join hints!

    Do you if these are these guaranteed to enforce a join order, or is it something that the optimizer might choose to disregard in favour of a join order it expects to perform less work?

    Reply
    • Erik Darling
      May 7, 2015 9:08 am

      I’ve never seen it not do it, but I haven’t played with it enough to know for sure. And remember, you’re putting the optimizer in the corner here, so only use it when something is going completely awful and chasing you around the woods with a machete.

      Reply
    • No, this syntax does not *guarantee* the execution plan will follow the written join order. For that, you’d also need an OPTION (FORCE ORDER) hint – something not to be used lightly because it has many subtle side-effects not limited to join ordering.

      Reply
  • I would probably put the 2 lunch tables being inner joined in a table expression and avoid the awkward join syntax.

    Reply
  • To add to my comment about table expressions. Either of these will produce the correct results and avoid dangling an ON clause.

    WITH lo AS (
    SELECT l.OzarID, lo.*
    FROM #lunch l
    INNER JOIN #LunchOrders lo ON lo.LunchID = l.LunchID
    )
    SELECT o.*, lo.*
    FROM #Ozars o
    LEFT JOIN lo ON lo.ozarid= o.ozarid;

    SELECT o.*, lo.*
    FROM #Ozars o
    LEFT JOIN (
    SELECT l.OzarID, lo.*
    FROM #lunch l
    INNER JOIN #LunchOrders lo ON lo.LunchID = l.LunchID
    ) lo ON lo.ozarid= o.ozarid;

    Reply
  • Rob Mackenzie
    May 6, 2015 9:11 pm

    We use this pattern often. The key is always to remember that you don’t join tables, you join result sets. I think it helps greatly to simply wrap the inner join with the outer join (just a formatting change). But conceptually, you’re outer joining the result set of two inner joined tables.

    LEFT…
    INNER…
    ON…
    ON…

    Thanks for providing the internals to show what’s happening.

    Reply
    • Yup. That’s what I’ve been doing for years. Though I never thought of it that way, nor used that terminology. I just call them “nested joins”—like when people say “nested queries” (aka subqueries).

      The parenthesis are actually useless. You can’t change order of operations with them; you have to re-order the join clauses.

      One extremely important thing to note about nested joins is scoping of variables. While you can select/project everything from down at the deepest join, you are restricted what columns you can bind to at the various join levels.

      Reply
  • Rob Mackenzie
    May 6, 2015 9:13 pm

    (Spacing forced with underscores)
    LEFT…
    _____INNER…
    _____ON…
    ON…

    Reply
    • That is fairly readable. I’ve just developed a strong distaste for this style of join syntax after having had to deal with MS Access generated SQL in the past. In goes wild with parans and nested ONs.

      Reply
  • @Tony – Yes, they produce the results but also have the incorrect plan with all the scalars being computed when I run it.

    As a pure style point (and in blind obedience to Codd’s laws) I think I would have designed the tables differently and only had one #Lunch entry for Brent (after all he only ate one lunch, even if it was sub-hic-stantial), but even playing with that didn’t seem to make a difference.

    Disappointing that the OUTER APPLY and the CTE didn’t produce the right plan. That’s a fail in the optimizer for me.

    Thank-you for sharing this.

    Reply
  • This reminds me of the dumb joke: Try to find out 2 + 2 * 2 in calc.exe/physical calculator vs MS Excel — 6 or 8?

    OUTER and INNER JOIN operators both have the same precedence level, so the analogy is more like 2 – 2 + 2 and you do need parenthesis to “push” addition before subtraction.

    I always write outer joins *after* all inner joins and nowadays instead of LEFT JOIN (a JOIN b ON ..) ON .. always prefer LEFT JOIN a ON .. LEFT JOIN b ON .. with no parenthesis. This might produce sub-optimal query plan but readability is worth it.

    Reply
  • @Erik

    It’s the same as the classical Customer – OrderHeader – OrderLine.

    You aren’t actually using #Lunch records with LunchID 2 and 3. The #LunchOrders are all attached to LunchID 1

    Reply
  • Hey, this blew my mind! One thing I spotted (after showing everyone in the office this page) is that for this particular query you could have expressed it as a right outer join, e.g.

    SELECT o.*
    , l.*
    , lo.*
    FROM #LunchOrders lo
    INNER JOIN #Lunch l ON l.LunchID = lo.LunchID
    RIGHT OUTER JOIN #Ozars o ON o.OzarID = l.OzarID;

    This produces the exact same plan as your funky nested join syntax, which makes sense given that you’re using nesting to reverse the join order. There will be other, more complex, scenarios that aren’t suited to a right outer join (also, who uses right outer joins?!).

    Reply
  • JHF Remmelzwaal
    February 28, 2017 5:25 am

    I read you post with interest, but the point you want to make is not clear to me. You clam “Forcing Join Order Without Hints”, but then you start with a functional incorrect query, you do not prove that its incorrectness will be solved by OPTION(FORCE ORDER) or other hints, and of cause it will not fix the issue, because the incorrectness is at functional level.
    The the functional mistake is a common one for beginners and I appreciate that you clarify it.
    One can argue that SQL has always an functional join order. The SQL optimizer has the freedom to rearrange the technical join order without breaking the functional join order. This technical join order can be frozen by the use of hints. So I get impression that different levels are mixed up here.
    Leave me the remark that Join Order hinting can be useful when SQL server has difficulties to make a good estimate of the cardinality, but that is an other topic.

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.