How to Think Like the Engine: When a Seek Isn’t

Last Updated 4 years ago

In our last episode, I introduced the concept of scan predicates: execution plan operations that weren’t able to seek directly to the rows they needed. Let’s take another query:

DropIndexes;
GO
CREATE INDEX IX_LastAccessDate_Id_DisplayName_Age
  ON dbo.Users(LastAccessDate, Id, DisplayName, Age);
GO
SELECT Id
  FROM dbo.Users
  WHERE LastAccessDate > '1800/01/01'
    AND DisplayName = N'Brent Ozar';
GO

1

2

3

4

5

6

7

8

9

10

DropIndexes;

GO

CREATE INDEX IX_LastAccessDate_Id_DisplayName_Age

ON dbo.Users(LastAccessDate, Id, DisplayName, Age);

GO

SELECT Id

FROM dbo.Users

WHERE LastAccessDate > '1800/01/01'

AND DisplayName = N'Brent Ozar';

GO

If we ONLY have the gray pages index on LastAccessDate, Id, DisplayName, and Age, our query plan looks like this:

I’m going to narrate this from bottom up because it makes for easier storytelling:

Seek Predicates: SQL Server was able to seek on this index to a specific date/time. It jumped to the first user whose LastAccessDate = 1800/01/01. That’s, uh, not going to eliminate a lot of users because after all, Stack Overflow wasn’t around in the days of covered wagons. Now, SQL Server has “seeked” to that user – but now the real work begins.
Predicate: now SQL Server examines every row to check its DisplayName, looking for people with DisplayName = ‘Brent Ozar’.
Number of Rows Read: 299,398 – because we had to examine every single row in the table to find Brent.

You see an Index Seek here, but it is not a lightweight operation. We read every single stinkin’ row of the table.

“Then why does SQL Server call it a seek?”

When SQL Server builds execution plans, it’s going on all the information it knows at the time – which isn’t necessarily much, and can also be out of date. We talked about how SQL Server uses statistics to build plans, but statistics aren’t updated in real time.

Sure, the last time SQL Server updated statistics on this table, it knew that nobody had a LastAccessDate prior to 1800/01/01 – but it can’t guarantee that the data distribution hasn’t changed since the last time statistics were updated. Hell, it even gives you the ability to disable statistics updates, so the stats might be WAY out of date.

Granted, you know that based on the column name “LastAccessDate,” that nobody is going to access StackOverflow.com in the past. The data just doesn’t work that way. But SQL Server doesn’t know that unless you tell it – and that’s where things like constraints come in, telling SQL Server business rules about your data.

SQL Server has to just say, “Well, I know that I have an index that starts with LastAccessDate, and they’re looking for a specific LastAccessDate range. I can jump to that range and bypass any extra data that’s been added since the last time statistics were updated.”

Technically, this query plan works totally fine – there’s nothing wrong with it. But there sure is something misleading about that execution plan, isn’t there? The operator says “Index Seek”, which implies rows are being eliminated, and the “1 of 3” makes it sound lightweight too. That thick arrow is your only clue that there’s more work involved here than first meets the eye.

Oh, that thick arrow size? On actual plans, it would look like it relates to the data coming out of that execution plan operator. That’s not what it means at all – on actual plans, it refers to the amount of data that was read by that operator. The thicker the arrow, the more work is being done.

So when you see a thick arrow on an actual plan, look in the direction that the arrow is pointing away from.

(How is that for messed up? Even the arrow is drawing your attention away from the very problem. I tell you, sometimes I think Microsoft does this stuff just to give me job security.)

5 Comments. Leave new

Randy
November 21, 2019 9:14 am

Wow, Great tidbit. What other differences there are between the representation of Actual vs Estimated Plans?

Reply
- Brent Ozar
  November 21, 2019 9:23 am
  
  Randy – glad you liked it! We go into a lot more of that in my Fundamentals of Query Tuning and Mastering Query Tuning classes.
  
  Reply
Simon
November 25, 2019 6:19 am

“I tell you, sometimes I think Microsoft does this stuff just to give me job security”

I’ve lost count of the number of times I’ve thought this. See also: implicit conversions in SQL Server (and many other assorted things that can make you look like a wizard by doing very small code changes), greatly increasing the row and column limits per sheet in Excel 2007 (thus allowing an exponentially greater number of “solutions” to be created using wholly inappropriate tooling, maximising the incidence of businesses that have dug themselves into a hole from which adopting SQL Server is one of the escape routes)

Reply
Klaus
January 9, 2020 5:55 am

Brent, I just realized that SSMS 17.9.1 and 18.4 both show the two indexed columns as “Seek Predicates and the second column additionally as “Predicate” in execution plan.
Your screenshot shows the second index column just as (not seeked, i.e. scanned) “Predicate”. May this be an issue with used SSMS versions (older versions might just show the second indexed column as “Predicate”?)?
What SSMS version did you use?

Reply
- Brent Ozar
  January 9, 2020 11:16 am
  
  You may need to post a repro script with the exact index creations you’re doing – you may have a different index structure. It’s a little beyond what I can troubleshoot quickly in a blog post comment, and mine still looks the same way on 18.4.
  
  Reply