07. Windowing Functions – Performance (8m)
Windowing functions can be a very efficient way of taking aggregate level data and detail level data and placing them side by side in the same result set. What I’m going to demonstrate here in this example is … Say we just want to take a couple of sums and put them together alongside some detail level data about our player roles. In this instance, what I want to look at, are the day roll total that is the sum of all rolls for a given date and I also want to look at the layer day roll total. That would be the sum of all the sum of all the rolls for a given date, by a given player. I want to put that side by side with the detail level data from the player rolls table.
Now, if I wanted to do this without using Window functions, which I have to do here, then I have some choices. I get to use temp tables or self joins or common table expressions. I’m going to use common table expressions here. I’ve got a DRT table alias, a PDRT table alias, and then I’ve referenced those in my main query down below. One of the things I’m interested in with my query is the impact it has on both CPU and on storage on disk IO. The way that I’m going to observe that is to enter a command, set statistics IO and time on. What I’m telling SQL server is that I want to see the CPU time and I also want to see some disk IO information with every query that’s returned in this session.
Now, this is only going to take place in this particular window. If I flip over to another window, I’m going to have to do the same step statistics IO time on there for [00:02:00] that to give me information into that particular window. I’ll go ahead and run this and I’ll hit control R to hide the results because I don’t need to see those anymore and then I’ll execute this main query. I get my results back and then over on the messages tab I can look at what’s happened from a resource point of view. I see that my CPU time is 16 milliseconds. That’s the number I want to focus on here along with the table player rolls. I see that that had 52 logical reads and there was also a work table involved where SQL server’s query engine had to do some thinking, mash things together, do joins, and so forth, and that all happened in the work table, and that incurred 5,500 logical reads.
Now that I know that, I can look at those numbers and then compare them to the equivalent query using Windows functions. Let’s take a look at how that one would stack up.
Here we are in the Windowing function equivalent of what we just ran in the other query using common table expressions. This time, we’re using Windowing functions and we’ll get our day roll total by just asking for the sum and we’ll say partitioned by roll date, just like the group by in our CDE and then for the player day roll total, we’ll partition by roll date and player ID, just like we would do a group by in the other common table expression.
You can tell already this is a whole lot less typing, but how does it perform? In order to get that information, we’ll have to make sure we set stats IO and time on. I want a control R to hide the results and then we’ll go ahead and run the query. Go to the messages tab and look at that. It’s pretty much the same amount of CPU time, so as far as CPU’s concerned, it really [00:04:00] doesn’t matter. If we look at the logical reads, that’s the number we want to focus on, there are fewer logical reads against the player rolls table.
There are, however, more logical reads against the work table. One thing to consider here is that given the choice between logical reads against a physical table and logical reads against a work table, which is generated on the fly by the query engine, we’re probably better off incurring logical reads against a work table. The reason for that is the less time that we spend in the player rolls table, in this case, the less time that we are going to be holding out a lock on the table and perhaps locking other queries that want to come in and also read from the player rolls table.
The work table belongs just to this query and it’s not going to have any lock that would keep another query out. Something to keep in mind. Not only do we get something that performs in the same amount of CPU time here, but it’s a whole lot easier to read and honestly a whole lot easier to debug if you ask me.
Let’s look at how, if we were to take all of the Windowing functions that we’ve covered so far and throw them together in one query, that would perform against our CDE query.
Okay, we’ve got this massive query here. It’s got roll numbers. It’s got sums over different partitions, averages over different partitions. It does a lag. It does a read. It does first value, last value. It’s even going to get the sum of all the rolls up to this point and the sum of the last four rolls, all this great stuff. How is it going to perform?
Well, we’ve got to set stats IO time on, we’ll execute that, hide it, and here we go. Okay, before I even get to the performance, let’s look at how much information we got back in a blink. [00:06:00] Granted, it’s only a 10,000 roll table, but that came back very very fast, and we’ve got all this different information about our roll table. How did it do? Not bad. 94 milliseconds. That’s pretty fast, considering how much we just asked it for. Still, just 26 logical reads. We’re getting in and out of that table very quickly. Granted, logical reads on the work table went up. We kind of expect that, because we’re asking SQL server to do a whole lot of thinking and calculation going on here, so not too surprising, but again, we got this back in less than 100 milliseconds. That’s pretty sweet.
Bottom line here is that you can ask for a lot of Windowing function data and it’s not going to incur a huge expense. There may be times where Windows functions do not perform as well as common table expressions, but in my experience, the more I’ve tacked onto a query, the more Windowing functions I’ve built in, the more benefit I got, versus a query that would have to use common table expressions and again, a nice bonus, we spent a lot less time hitting physical tables and a lot more time in a work table where we’re out of everyone’s way.
- 01. Prologue and Script Files
- 02. Thinking in Sets (5m)
- 03. Number and Date Tables (10m)
- 04. Case Expressions (17m)
- 05. Computed Columns (10m)
- 06. Windowing Functions – Design (21m)
- 08. Replacing Cursors, Part 1 (6m)
- 09. Replacing Cursors, Part 2 (5m)
- 10. Replacing Cursors, Part 3 (7m)
- 11. Let the Adventure Begin! (9m)
- 12. About Doug (2m)