Free Live Webcast: The First SQL Server Webcast You Should Ever Attend.
When you pass in a query, how does SQL Server build the results?
It sounds really simple, but we all take so much for granted.
When you pass in a query, how does SQL Server build the results?
It sounds really simple, but we all take so much for granted.
You’ve heard of my free How to Think Like the Engine class, and maybe you even started watching it, but…it has slides, and you hate slides.
Wanna see me do the whole thing in Management Studio, starting with an empty query window and writing the whole thing out from scratch live? This session is for you.
In our last episode, I introduced the concept of scan predicates: execution plan operations that weren't able to seek directly to the rows they needed. Let's take another query:
[crayon-6a4717e18c473446128847/]
If we ONLY have the gray pages index on LastAccessDate, Id, DisplayName, and Age, our query plan looks like this:
I'm going to narrate this from bottom up because it makes for easier storytelling:
We've been working with the clustered index of the Users table, which is on the Identity column - starts at 1 and goes up to a bajillion:
And in a recent episode, we added a wider nonclustered index on LastAccessDate, Id, DisplayName, and Age:
[crayon-6a4717e195c2a146467468/]
Whose leaf pages look like this:
In our last cliffhanger episode, I said that if we ran this query: [crayon-6a4717e195e15889477831/] And we had this index: [crayon-6a4717e195e1b569103938/] Then we would have to do all of these things: Look up user #643 on the clustered index by doing a seek - because thankfully our kind developer included the clustered primary key in the…
In our last episode, in between crab rangoons, I had you create either one of these two indexes:
[crayon-6a4717e196031971574403/]
And I said that the leaf pages of either index would look the same:
In terms of the space they take up on the leaf pages, it doesn't matter whether columns are in the keys of an index or in the includes. It's the same amount of space on the leaf pages. To see it, run sp_BlitzIndex focused on that table:
In our last episode, we were running into problems with these two queries: [crayon-6a4717e196568167056640/] When SQL Server saw the function in the second query's WHERE clause, it punted out with a 1-row estimate, which caused us problems. SQL Server did an index seek + key lookup where it wasn't appropriate. Well, if we're not allowed…
In our last episode, we saw how SQL Server estimates row count using statistics. Let's write two slightly different versions of our query - this time, only looking for a single day's worth of users - and see how its estimations go: [crayon-6a4717e19674a050001154/] Both of those queries are theoretically identical in that they accomplish the…
In our last episode, SQL Server was picking between index seeks and table scans, dancing along the tipping point to figure out which one would be more efficient for a query.
One of my favorite things about SQL Server is the sheer number of things it has to consider when building a query plan. It has to think about:
In our last episode, I'd expanded our query to include DisplayName and Age - two columns that weren't in our nonclustered index: [crayon-6a4717e196d0b411340605/] So as a result, I was getting key lookups in the execution plan: And I spent a lot of time talking about the overhead that each key lookup incurs. Astute readers among…
In our last couple of queries, we've been using a simple query to find the Ids of everyone who accessed the system since mid-2014:
[crayon-6a4717e196f27699198937/]
But Ids alone aren't all that useful - so let's add a few more columns to our query:
[crayon-6a4717e196f2c452961098/]
Now think about how you're going to execute this query plan in plain English, as a human being. You have two copies of the table: the nonclustered index (black pages) with LastAccessDate, Id:
In our last episode, we added a nonclustered index, and now even though this query has a filter and an ORDER BY, it hardly did any work at all - just a lightweight index seek:
Ah, index seeks - nothing better than that, right?
When we left off in the last post, our users kept running this query, and they want it to be really fast:
[crayon-6a4717e197457207130904/]
Let's pre-bake the data by creating a copy of the table sorted in a way that we can find the right rows faster:
[crayon-6a4717e19745c221360778/]
This builds a separate copy of our table (also stored in 8KB pages) that looks like this:
Earlier in this series, we ran a query with ORDER BY, and we realized that it was CPU-intensive work that tripled the cost of the query: Now, let's run that query a bunch of times. In SSMS, you can add numbers after the GO, and SSMS will run your query a number of times. I'll…
In our last post, we ran a query with an ORDER BY, but we only got one column in the SELECT:
[crayon-6a4717e197a4f377576673/]
The estimated cost was about $18 Query Bucks because SQL Server had to:
Scan the entire clustered index, yelling out the Id and LastAccessDate of each row
Sort that list by LastAccessDate
We started out the How to Think Like the Engine series with a simple query with a WHERE clause:
Now let's add an ORDER BY:
[crayon-6a4717e197c3b715817855/]
Here's the updated plan - note that the query cost has tripled to $17.72 Query Bucks. Let's dig into why:
In the first post in my How to Think Like the Engine series, I started by doing a pretty simple query:
But I noted that if you were following along on your computer, you might see a different execution plan. If I tweak just one thing about my SQL Server, I see a different plan:
You’re a developer or a DBA, and you’re comfortable writing queries to get the data you need. You’re much less comfortable trying to design the right indexes for your database server.
In the next several posts, you’ll learn: