Long distance information
There was a rather interesting question posted on dba.stackexchange.com recently about CASE expression order of execution with an OR predicate. Mouthful, I know! When I saw it, I got all “I HAVE A REALLY GOOD ANSWER” and started to write demo queries.
Then I wrote this blog post instead. Sorry, j.r. — you see, I hit an actual execution plan bug that did not bode well.
And hey, what’s with those filters?
I thought those only showed up when we did something weird, like have a predicate on a MAX datatype, or a Window function result? Why are they here?
The answer is, of course, because SQL is lazy.
Let’s take a quick look at the demo query:
1 2 3 4 5 6 7 8 9 10 11 |
SELECT SUM(x) FROM ( SELECT CASE WHEN EXISTS (SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND u.Id >= 1 AND u.Id <= 22656) OR EXISTS (SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND u.Id >= 22657 AND u.Id <= 60000) THEN 1 ELSE 0 END AS [Pig] FROM dbo.Users AS u ) x (x) |
Both of the filters have the Startup Expression Predicate property. Here’s the first one.
What the heck does that mean?
Well, when everyone is telling you to read a query plan from right to left (even I do that, don’t be ridiculous), the order things logically occur in is from left to right. In this case, that filter predicate is only asking for rows from the Posts table where rows from the Users table meet the predicate.
If you don’t believe me, maybe you’ll believe STATISTICS IO.
For the first query we ran, the output looks like this, with the Posts table being scanned 23,672 times, resulting in 71,063 logical reads.
Table ‘Posts’. Scan count 23672, logical reads 71063, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Users’. Scan count 7, logical reads 6587, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
If we change our query a little, that changes a lot. Here’s me vs. Jon Skeet, in a battle of the, well, in the battle of I-Lose-Really-Badly. That’s somewhere in France.
1 2 3 4 5 6 7 8 9 10 11 |
SELECT SUM(x) FROM ( SELECT CASE WHEN EXISTS (SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId = 22656) OR EXISTS (SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId = 3187747) THEN 1 ELSE 0 END AS [Pig] FROM dbo.Users AS u ) x (x) |
Here’s what STATS IO tells us: we did way less work. Posts was only scanned twice, and we only did 6 logical reads.
Table ‘Posts’. Scan count 2, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Users’. Scan count 7, logical reads 6587, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Return nothing, read nothing
Where this gets super duper extra cool, is when we run queries where no rows match at all. Check out these two suckbags.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
SELECT SUM(x) FROM ( SELECT CASE WHEN EXISTS (SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId >= 2147483647) OR EXISTS (SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId <= -2147483647) THEN 1 ELSE 0 END AS [Pig] FROM dbo.Users AS u ) x (x) SELECT SUM(x) FROM ( SELECT CASE WHEN EXISTS (SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId = 2147483647) OR EXISTS (SELECT 1 FROM dbo.Posts AS p WHERE p.OwnerUserId = u.Id AND p.OwnerUserId = -2147483647) THEN 1 ELSE 0 END AS [Pig] FROM dbo.Users AS u ) x (x) |
For these, STATS IO is the same. The Posts table never got touched, because no predicates passed the filter. No scans, no reads.
Table ‘Posts’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Users’. Scan count 7, logical reads 6587, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Filterless
When you’re working with query plans and see a filter, it could very well be there for a good reason. That good reason being, of course, that SQL is lazy.
Thanks for reading!
Brent says – wow, I’ve never seen that Startup Expression Predicate! I had no idea that was even a thing.
6 Comments. Leave new
Would you mind giving us the table definition or at least the DDL for the index on the posts table named ix_posts.
That should help my brain make some sense out of the information I am looking at here.
It was just on OwnerUserId!
Also, sorry, that was half an answer. If you want to see what the whole public Stack Overflow schema looks like, check here.
Hi, let’s assume we join table with MSTVF inside inline function with parameter that is the startup predicate for the whole join. It turns out that startup predicate is put sometimes before first operation of inner join and sometime before second. Do You know how to force position of Startup Predicate in query plan ?
Grzegorz — That’s a great question! You should post it on dba.stackexchange.com with the code and plan, etc. I just can’t do it justice in a blog comment.
Thanks!
Here is my post with example https://dba.stackexchange.com/questions/206191/how-to-control-position-of-startup-predicate-in-the-execution-plan