How to Think Like the SQL Server Engine: What’s a Key Lookup?

In our last couple of queries, we’ve been using a simple query to find the Ids of everyone who accessed the system since mid-2014:

But Ids alone aren’t all that useful – so let’s add a few more columns to our query:

Now think about how you’re going to execute this query plan in plain English, as a human being. You have two copies of the table: the nonclustered index (black pages) with LastAccessDate, Id:

And the clustered index (white pages) with all of the columns in the table:

dbo.Users clustered index

One way is to use the nonclustered index first.

We could:

  • Grab the black pages, seek to 2014-07-01, and start making a list of the LastAccessDates and Ids we want
  • Grab the white pages, and for each Id we found in the black index, look up their Id in the white pages to get the extra columns (DisplayName, Age) that aren’t included in the white index

Here’s what that query plan looks like:

We read plans from right to left, but also top to bottom. The first thing SQL Server did was the Index Seek at the top right – that’s the seek on the black pages. SQL Server seeked to a specific date/time, read out the LastAccessDate & Id for each row, and that was the end of the Index Seek’s work. (He’s a totally separate standalone mini-program.)

Then, SQL Server took that list of Ids, and looked up each one of them on the white pages – that’s the Key Lookup operation. SQL Server used their clustering key – in this case, Id, because that’s what we built our clustered index on. (If you don’t build a clustered index, this same basic process still happens – but more on that in another episode.)

That’s why every nonclustered index includes the clustering keys.

When I first told you I created the black index, I said I did this:

But I didn’t have to. I could have just done this:

And we would have ended up with the same basic black pages. SQL Server adds the clustering keys into every nonclustered index because it has to be able to accomplish these key lookups. For every row it finds in a nonclustered index, it has to be able to track back to that one unique row in the clustered index.

I just told you I created the index on both columns in the beginning because I had to ease you into this. I didn’t wanna say anything, but you weren’t particularly with it that first morning you started reading this series.

Yes, I’m still holding things back from you right now even as I type this. No, I didn’t want to say that because it’s only going to make things awkward between us. Also, you’re not my only reader. You’re still the one I love the most, though. Now let’s just move on before you ask more questions.

Key Lookup operators obfuscate a lot of details.

I wish execution plans were three-dimensional: I wish operators popped off the page based on the number of times they were executed. You see that “Key Lookup” and you think it only happened once, but that’s not true at all. Hover your mouse over it, and you’ll see Number of Executions – that’s how many times it was actually performed:

Earlier, I wrote that when Microsoft paints stuff on a screen, they either put it in alphabetical order, or they load up the Data Cannon, fire it, and wherever the data goes, that’s where it ends up. Here, this tooltip was arranged by the Data Cannon, so we’re going to jump around a little.

  • Estimated Number of Executions is based on how many rows SQL Server thought was going to come out of the Index Seek. For each row we found, we’re gonna have to execute one Key Lookup.
  • Estimated Number of Rows – how many rows EACH key lookup will return.
  • Number of Executions – how many times we actually did it, based on how many rows actually came out of the index seek.
  • Actual Number of Rows – the total number of rows that came out of ALL key lookups (not each one.)

I know. Sometimes they say Actual, sometimes they don’t. Sometimes the numbers are per-execution, sometimes they’re total. If you’re looking for consistency and organization, you don’t use the Data Cannon. I’m sure they treat the internals of SQL Server with much more accuracy and care.

The cost of the key lookup is logical reads.

Each time we execute the key lookup – in this case, 1,576 times – we’re opening up the white pages and doing some logical reads. To see it, let’s run a no-lookup query (that only gets Id) and a multi-column query (that gets DisplayName and Age too) and compare their logical reads:

The top query only does 7 reads, and the bottom one does 4,842! Those 1,576 key lookups needed to do about 3 logical reads each. (Why not just 1 read each? Because they need to navigate a tree structure of pages to find exactly the right page that has the data they want – there’s a couple/few pages of overhead each time you need to seek in an index. We’ll talk about alternatives to that later too.)

The more rows your index seek returns, the more likely it is that you won’t get this execution plan at all. Notice that in this post, I didn’t use 2014-07-01 as my filter date – I used something much more recent. You’ll learn why in the next episode.

Previous Post
How to Think Like the SQL Server Engine: So Index Seeks are Great, Right?
Next Post
How to Think Like the SQL Server Engine: What’s the Tipping Point?

5 Comments. Leave new

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.

Menu
{"cart_token":"","hash":"","cart_data":""}