Window Functions and Cruel Defaults

SQL Server, T-SQL
5 Comments

My First Post Here…

Well, my first technical post, was about how the default index creation method is OFFLINE. If you want that sweet, sweet Enterpri$e Edition ONLINE goodness, you need to specify it. It’s been a while since that one; almost six months to the day. So here’s another one!

But Window Functions Are Awesome

Heck yeah they are. And how. Boy howdy. Etc. You get the point. I’m enthusiastic. What can be cruel about them? Glad you asked!

Window Functions, according to the almighty ANSI Standard, have two ways of framing data: RANGE and ROWS. Without getting into the implementation differences between the ANSI Standard and Microsoft’s versions, or any performance differences between the two, there’s a funny difference in how they handle aggregations when ordered by non-unique values. A simple example using the Stack Overflow database follows.

For the month of August, Year of Our Codd 2008, we’re getting a running total of the score for posts by UserId 1. Who is UserId 1? I’ll never tell. But back to the syntax! In the first SUM, we’re not specifying anything, for the next two we’re specifying RANGE and then ROWS. Why? REASONS! And why am I casting the CreateDate column as a date? MORE REASONS!

Before you scroll down, think for a second:

If I don’t specify RANGE or ROWS, which will SQL Server use?
If I left the CreateDate column as DATETIME, what eff aff difference would it make to the output?

Do you see a pattern forming here?

OH MY GOD IT WORKED
OH MY GOD IT WORKED

When we don’t specify RANGE or ROWS, well, SQL Server is nice enough to pick RANGE for us. “Nice”.

Whose fault? Default!
Whose fault? Default!

Deep breaths, Erik. Deep breaths.

You should also notice the difference in how each different method aggregates data. When the ordering column has duplicates, RANGE, and by extension, the default method, will SUM all the values for that group at once. When ROWS is specified as the framing method, you see the running total that most people are after.

Make project managers happy!
Make project managers happy!

And, of course, if all the values were unique, they’d do the same thing.

Back for a day
Back for a day

Wrap. It. Up.

This one is pretty self explanatory. If you’re lucky enough to be on SQL Server 2012 or greater, and you’re using Window Functions to their full T-SQL potential, it’s was easier to calculate running totals. Just be careful how you write your code.

If you like this sort of stuff, Check out Doug’s new video series, T-SQL Level Up. There are next to zero fart jokes in it.

Previous Post
Announcing Our 2016 In-Person Training Class Schedule
Next Post
Forgotten Maintenance – Cycling the SQL Server Error Log

5 Comments. Leave new

  • ” it’s was easier to calculate running totals”

    I’m pretty sure you wanted to say ” it’s way easier to calculate running totals”.

    Good article.

    Cheers.

    Reply
  • Aside from the change in behavior, there’s also a very serious performance impact when RANGE is used instead of ROWS.

    Worst.Default.Ever.

    Reply
  • Are there performance differences between RANGE and ROWS?

    I remember a training course where us was told that RANGE will use spills to tempdb and ROWS not.

    Following to statistics I/O RANGE produces several reads on worktable while ROWs has a worktable but shows me zero reads (don’t know if it means that there are really zero reads on worktable or if the statistics are lying).

    This is particularly true, if you
    CREATE INDEX idx_post_dummy ON dbo.Posts (OwnerUserId, CreationDate) INCLUDE (Score)

    Reply
    • Yes, that’s true. RANGE needs a worktable in tempdb, while ROWS can use (under certain conditions) a specialized in-memory table (hence the big performance improvement).

      That’s also why you see a worktable but with 0 reads I guess.

      Reply
  • First time I’ve read through an article on windows functions that I have actually understood the concept. Thanks and nicely done!

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.