Up til now in the DBA Training Plan, we’ve been taking inventory of our servers, making sure the data’s well-protected, and understanding the basics of how the data’s stored in indexes inside those data files. Now, let’s start looking at performance, and let’s start by zooming really far out to think about how SQL Server runs a query.
When you pass in a query:
- SQL Server comes up with an execution plan for it
- SQL Server executes that plan
The first one happens in what seems like an instant, but in that instant, a whole lot of really complex decisions are made that will determine your entire experience.
It’s a lot like building a custom house.
When you decide to build a custom home from scratch:
- An architect designs a blueprint plan for it
- The construction company turns that blueprint plan into reality
The architect asks you a lot of questions about your style, how you like to live, and probably most importantly, how many people you expect to be living in this house. Capacity planning and honesty with your architect are both important here: if you tell them you plan to live by yourself and don’t entertain much, then you’re going to get a very different blueprint and cost than if you say you’ve got 8 kids and throw a lot of parties.
Once the architect hands off plans to the construction company, those plans are pretty well set in stone. You might be able to make small, superficial changes, but you’re not changing it from, say, 1 bedroom to 5 bedrooms. If you were single and had your architect design a 1-bedroom home, but then during construction, you decide to get married to someone who has 6 kids from a prior marriage, it’s not like the builder can move a couple of walls around and make everyone happy. You’re going to have to stop that project, then go back to the architect for a new plan.
There’s a strong division between the two phases: the architect’s work, and the construction company’s work. I might even say there’s a wall between them, but then this metaphor gets really complicated.
SQL Server works the same way.
(And by that, I mean all your construction projects run late and over budget. (Just kidding. (Not really.)))
Earlier, I used a really simple two-step process saying “SQL Server” comes up with an execution plan, and then executing it. I was wildly oversimplifying that – but now let’s dig a little deeper:
- The architect hears your list of requirements. Unlike our house-based analogy, where the architect can ask followup questions, he just has to sit here and listen to your demands, and hope that you describe your needs in a clear, accurate manner. T-SQL is a declarative language: you’re declaring the shape of the result set you want, but not necessarily describing how you want the exact result set to be built. (You can do that, though.)
- The architect thinks about whether he’s heard those requirements before. If someone’s come into his office and asked for something extremely similar, he may pull a blueprint out of his cache and hand that to you so the construction company can immediately start working.
- If he hasn’t, he’ll design a new
blueprintexecution plan. To get a rough idea of just how hard this work can be, scan Microsoft’s Query Processing Architecture Guide – and then realize that it’s an incredibly brief summary! The architect does all this work, and then saves the query’s blueprintexecution plan in cache (memory) so that he can reuse it if you happen to run the query again.
- The construction company takes the fresh or cached plan and starts work. This is the part you usually focus on, and you usually say things like, “Dang, it’s taking way too long to run this query.” Thing is, it’s not usually the construction company’s fault: they were just given a plan that wasn’t appropriate for the amount of data involved in the query. Maybe it was designed for a tiny amount of data but way too much came back, or maybe it was designed for a huge amount of data and did a ton of unnecessary prep work for a tiny amount of data.
When you’re reviewing the cost & runtime of a project, start by looking at the blueprint to see if it’s appropriate. This is where things usually went wrong.
Start by reviewing the blueprint for a query.
In SQL Server Management Studio, click Query, Include Actual Execution Plan, and run this query:
SELECT * FROM sys.databases;
That query gives you a list of databases on the server – but I don’t really care about the results. I’m more interested in the execution plan. Click on the Execution Plan tab of SSMS, and you get a graphical plan showing the work that SQL Server did. We’ll dig more into interpreting execution plans soon, but for now, right-click on the SELECT icon at the top left, and click Properties:
A new Properties pane will appear on the right side – look at the top of that, and there’s a ton of juicy details:
- Compile CPU = the number of milliseconds of CPU time spent building the query plan
- Compile Memory = well, I don’t think I’ve ever really used this for much
- Compile Time = the number of milliseconds overall spent building the plan (can be much higher than CPU if we have to wait for things to come back from storage, like the first time we compile a plan in a database after it’s been restored or started up)
- NonParallelPlanReason = if the query plan couldn’t go parallel and use multiple cores, you’ll get a hint here
- Reason for Early Termination of Statement Optimization = the architect may decide to go home and start drinking early
These metrics focus on the architect: the one building the query plan. When you think about it, he accomplishes a spectacular amount of work in a short period of time. It’s amazing that we get a working query plan at all that quickly!
This has interesting implications for you.
Plans are built without awareness of execution frequency. SQL Server doesn’t know whether you’re going to run this query just once, or a million times. It just builds a plan once and caches it.
Plan structure isn’t revisited even when it goes wrong. SQL Server doesn’t go back and say, “Whoa, performance on that one was terrible – I thought I was only going to bring back 1 row, but 1,000,000 rows came back, and I should do things differently.” Starting with SQL Server 2017, Microsoft does revisit some parts of the plan, like how much memory gets granted, but the plan shape stays the same. It’s up to you to figure out when the plan shape is incorrect, and coach SQL Server into designing a better plan.
You don’t get great knobs to control architect time. You can’t tell SQL Server, “Listen, I need you to spend a lot more time designing plans on this server because performance here is really important.” There’s a trace flag you can enable to force SQL Server to spend more time building EVERY query plan, but that’s a bad idea since it also includes things like IntelliSense queries and monitoring applications.
In theory, SQL Server will rewrite your query. You’ll come across blog posts that suggest it doesn’t matter how you write your query, but read those posts carefully: they come with caveats. The classic example is when SQL Server runs out of time during plan generation and decides to ship a plan – even though it knows there might be better plans available if it spent more time doing optimization.
In practice, human-readable queries perform better. If a human being can look at the query and get a pretty good idea of what’s going on quickly, then so can SQL Server. If a human being looks at the query and develops a look of confused terror, then SQL Server is probably not going to build a great execution plan either.
Plans stick around longer than you might expect. In a modern server with 64GB RAM or more, plans can stay in cache for days, weeks, and even months. If it was a good execution plan, then that’s a good thing. If it was a bad execution plan, well, that’s why people start doing index rebuilds in order to “fix” the problem. More on that in the next episode.
With this background in mind, next up, let’s review which query plans are in your server’s cache.