What Is Skewed Parallelism?
When queries go parallel, some assumptions get made:
- There’s enough work to keep multiple threads busy
- Each thread will get an equal amount of work to do
The ‘equal amount of work’ part is particularly important, because in a parallel plan, each thread gets an equal share of memory up front.
If threads end up doing uneven amounts of work, query performance may suffer — especially where spills are involved.
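To make the memory angle concrete, here’s a minimal arithmetic sketch. All the numbers (grant size, DOP, per-row memory cost) are made up for illustration; the only real idea is that each thread’s share of the grant is fixed up front, so a thread that gets all the rows can’t borrow the idle threads’ memory.

```python
# Hypothetical numbers for illustration -- not taken from any real plan.
GRANT_KB = 100_000   # total memory grant for the query
DOP = 4              # degree of parallelism
ROWS = 3_000_000
KB_PER_ROW = 0.02    # pretend each row needs ~20 bytes of sort memory

per_thread_grant_kb = GRANT_KB / DOP  # each thread gets a fixed, equal share

def spills(rows_on_thread):
    """A thread spills if its rows need more memory than its fixed share."""
    return rows_on_thread * KB_PER_ROW > per_thread_grant_kb

# Even distribution: each thread sorts 750k rows -> 15,000 KB needed,
# which fits comfortably in its 25,000 KB share.
print(spills(ROWS // DOP))  # False

# Skewed: one thread gets all 3M rows -> 60,000 KB needed,
# but it still only has its 25,000 KB share.
print(spills(ROWS))  # True
```

Same total work, same total grant; only the distribution changed, and that’s enough to turn a clean sort into a spill.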
What Does Skewed Parallelism Look Like?
Unfortunately, you can only see it in actual execution plans right now. As far as I can tell, there’s no Extended Event that will catch it, either.
If you check the Properties of an operator, and look under the Actual Rows node, you can see how many rows ended up on each thread.
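If you’d rather not click through Properties windows, a saved actual plan (`.sqlplan`) is just showplan XML, and the per-thread counts live in `RunTimeCountersPerThread` elements under each operator’s `RunTimeInformation` node. Here’s a sketch that pulls them out; the XML fragment is hand-made to mirror the skewed plan described below (the element and attribute names match the showplan schema, but the numbers are illustrative).

```python
# Sketch: extract per-thread row counts from showplan XML.
import xml.etree.ElementTree as ET

# Real showplan namespace used in .sqlplan files.
NS = {"sp": "http://schemas.microsoft.com/sqlserver/2004/07/showplan"}

# Hand-made fragment of a RunTimeInformation node; in a real plan,
# one of these sits under each parallel operator (RelOp).
plan_fragment = """
<RunTimeInformation xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <RunTimeCountersPerThread Thread="0" ActualRows="0" />
  <RunTimeCountersPerThread Thread="1" ActualRows="3000000" />
  <RunTimeCountersPerThread Thread="2" ActualRows="0" />
  <RunTimeCountersPerThread Thread="3" ActualRows="0" />
  <RunTimeCountersPerThread Thread="4" ActualRows="0" />
</RunTimeInformation>
"""

root = ET.fromstring(plan_fragment)
rows_per_thread = {
    int(t.get("Thread")): int(t.get("ActualRows"))
    for t in root.findall("sp:RunTimeCountersPerThread", NS)
}
# Thread 0 is the coordinator; the worker threads do the row work.
workers = {t: r for t, r in rows_per_thread.items() if t != 0}
print(workers)  # {1: 3000000, 2: 0, 3: 0, 4: 0}
```

When one worker’s `ActualRows` dwarfs the rest, that’s the skew this post is about.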
In this particular plan, all the rows ended up on one thread, and a Sort ended up spilling.
The warning on the Sort tells the story: all three million-ish rows ended up on one thread, which then spilled due to insufficient memory.
The memory being split evenly here came back to bite us. If rows had ended up evenly distributed, we probably would have been okay, or spilled significantly less.
What Causes Skewed Parallelism?
Most often, it’s the way the parallel page supplier assigns pages to threads: some threads end up with far more rows than others.
But there are some operations, like Eager Index Spools, that aren’t parallel aware.
If you see one in an actual plan, you can bet that the index access immediately prior to it will have all rows on a single thread.
How Do You Fix It?
Sometimes you can change DOP; other times you need to dig deeper and rewrite the query in a way that forces rows through a Distribute Streams or Repartition Streams operator.
If this is the kind of stuff that interests you, you should totally come to our precon at PASS Summit this year, Performance Tuning in 21 Demos. We’re going to be demoing interesting behavior in SQL Server 2017, 2019, and even Azure SQL DB, showing you how query plans can go wrong.
Thanks for reading!