Sliding Scale Parallelism: Why Stop At DOP?


This is a thought experiment

Just fair warning, you’re probably not going to learn anything new here. This also isn’t hidden in a CTP of SQL Server 2017, and as far as I know, it’s not a consideration outside of this post. Seriously, lawyers, I know nothing.

I was just thinking (while sober, mind you) about a what-if scenario: what if parallelism settings weren’t so two-dimensional? Right now, when you hit X, you get X. I know, that’s vastly oversimplified, and there’s all sorts of stuff that goes into figuring out DOP and parallel query placement. That’s fine.

But let’s say the knobs you had to turn were:

  • MAXDOP — duh
  • Cost Threshold for Parallelism — duh again
  • Starting DOP — The DOP a query would get when it breaks CTFP
  • DOP increment — The number of cores to add per CTFP increment
  • CTFP increment — The cost increment at which more DOP is assigned

I know, it sounds weird, but bear with me.

What if you set CTFP to 50, and you’re cool with a query that costs 50 Query Bucks going parallel, but you don’t want it chewing up DOP 8?

Granted, these costs are estimates, and the estimates are wonky as all get-out sometimes.

But we’re already basing the decision to go parallel on wonky costing. Why not have more control over it?

Why not say, at CTFP 80 you can have 4 DOP, at 120 you can have 6 DOP, and for anything over 200 you can have 8 DOP?
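To make the idea concrete, here’s a sketch of what such a sliding scale could look like. To be clear: SQL Server exposes no such settings — the function, its parameters, and the linear ramp are all made up for illustration, and they only roughly approximate the tiers above.

```python
def sliding_scale_dop(query_cost, ctfp=50, starting_dop=2,
                      dop_increment=2, ctfp_increment=40, maxdop=8):
    """Pick a DOP from estimated query cost.

    Hypothetical knobs, not real SQL Server settings:
    starting_dop   - DOP granted when a query crosses CTFP
    dop_increment  - cores added per CTFP increment
    ctfp_increment - cost increment at which more DOP is assigned
    """
    if query_cost < ctfp:
        return 1  # stays serial below Cost Threshold for Parallelism
    # Add dop_increment cores for each ctfp_increment of cost past CTFP,
    # capped at MAXDOP.
    steps = int((query_cost - ctfp) // ctfp_increment)
    return min(starting_dop + steps * dop_increment, maxdop)

for cost in (40, 50, 90, 130, 250):
    print(cost, "->", "DOP", sliding_scale_dop(cost))
```

With these made-up defaults, a 40-cost query stays serial, a 90-cost query gets DOP 4, and anything past 170 hits the MAXDOP ceiling of 8 — cheap parallel queries get a little help, expensive ones get the full treatment.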

It’s certainly an interesting conversation, especially with SQL Server licensing.

What’s licensing got to do with it?

When you opt in to Enterprise Edition, you pay $7k per core. Most people out there do the logical thing, and buy the smallest number of the fastest cores they can to support their workload.

When you start running AGs, or bringing on users, thread count becomes more of a consideration.

You don’t get infinite threads. When you run out of them, you run into THREADPOOL. This is not the idyllic swimming pool at a Club Med. This is the pool you get stuck in right before ending up in a PSA for gasoline huffing.

Looking at the chart for Max Worker Threads: at 32 cores (that’s about 225k in licensing bucks), you only get 960 threads. That means you can run 120 simultaneous parallel operations. Not 120 simultaneous queries, 120 simultaneous operations. Remember, a parallel query can have multiple branches, and each of those branches will get DOP threads. So if you have a query that does four joins, it could get DOP 8 * 4 threads. That’s 32 threads.
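The back-of-napkin math above, as a sketch (this ignores the coordinator thread each parallel query also uses, and branch counts vary by plan shape — the point is the order of magnitude):

```python
max_worker_threads = 960  # default at 32 cores on 64-bit SQL Server
dop = 8

# Each parallel operation consumes DOP worker threads,
# so the thread pool bounds simultaneous parallel operations.
simultaneous_ops = max_worker_threads // dop
print(simultaneous_ops)  # 120 parallel operations, not 120 queries

# A single plan with four parallel branches (e.g. four joins) at DOP 8:
branches = 4
print(branches * dop)  # 32 threads for one query
```

Four queries like that last one and you’ve already burned an eighth of the pool.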

One way of controlling thread consumption is tuning parallelism settings. This would give folks running into issues a couple of additional knobs to turn, without sacrificing performance for more costly queries, and without having to go through code and add MAXDOP hints. After all, if you run into a parameter sniffing issue in a stored procedure, the cost you get for one set of parameters could be wildly different from another. Those MAXDOP hints could come back to haunt you.

This opens up a more complicated can of worms: is DOP influencing cost estimation now a recompile reason?

It used to be.

Stay Adaptive, Pony Boy

With SQL Server’s new Adaptive Join technology, this could be another consideration along with join type. Rows are a part of cost estimation, and if we’re deciding join types on the fly based on row thresholds, certainly we could apply additional DOP logic as well.

Thanks for reading!

P.S. Oracle doesn’t have this, so Microsoft could totally one up them here to make up for the fact that they don’t have a cool floating super fast yacht.


11 Comments

  • Steve Smith
    July 14, 2017 8:36 am

    What would be cool is if the optimizer could work out roughly how many cores would give the best performance/thread/concurrency ratio and automagically change the number of cores used for you.

    If it could determine, based on normal system usage, that the server is in a quiet period, it might choose to use all cores, even if the performance saving is minimal. Whereas at other times it would use fewer cores so there’s higher availability for everyone else.

    While we’re in this crazy utopia world, it would also automatically place a beer order for you just before your server crashes

    Reply
  • Maybe a future feature for Resource Governor so different workloads can be told to behave differently?
    i.e. reporting Users/connections can have this sliding scale .. but not more than 6 cores.
    data import connections can use this alternate sliding scale.
    They added IO to resource governor so you never know 🙂

    Reply
    • Erik Darling
      July 14, 2017 11:36 am

      Perhaps, but RG has such a low adoption rate, plus it’s Enterprise only. At first glance, it’s not available in Standard for 2016 SP1, either. Ideally this would be accessible and transparent to end users, with minimal administration*.

      *=You know, while we’re dreaming.

      Reply
  • Matthew Giles
    July 16, 2017 5:14 pm

    I like the Thought Experiment; it certainly has some good points of merit to it.
    The only drawback is that vendors of code, many of whom are lazy when it comes to writing decent & efficient code, would write worse code than they do now and use the settings to compensate.
    On the upside, it will take tuning more into the realm of the DBA where it belongs.

    Reply
    • Erik Darling
      July 16, 2017 7:05 pm

      I think you’re generally right, but what I tend to see from vendors is either complete unawareness of, or hyper-awareness of parallelism. Sometimes I like to think things like this shock people into reevaluating their code or best practices, but, you know… What I like to think doesn’t always align with reality. Perhaps it’s to blame on early childhood gin consumption.

      Reply
  • Yann Louchart
    July 17, 2017 3:05 am

    Quite an interesting thought experiment indeed. Well, Microsoft was bragging about SQL 2017 having some sort of AI embedded, with optimiser anti-regression and the lovely adaptive joins to name a couple… maybe they already have something better than “all or nothing” in terms of parallelism up their sleeve.
    SQL Server is already doing quirky things with MaxDOP though: I had MaxDOP set to 1 (as recommended for Dynamics AX configuration) and got a poor execution plan for a certain query. At MaxDOP >1, SQL Server found a much better plan… which wasn’t parallel. I guess the former plan would scale up OK-ish in a non-parallel universe (!), whereas the latter could go parallel happily, making it good in both conditions (non-parallel and parallel). Kind of weird.
    If you make the optimizer factor in more parameters, like a throttle in the number of cores to enrol in the query, one may need some “supervisor” process to check the optimizer never goes crazy and makes a very bad choice. It would make performance tuning “interesting” otherwise 😉

    Reply
  • Fabien Sabinet
    July 17, 2017 5:20 am

    I have made a little tool, SQLProcexp (“SQL Process Explorer,” on the same principle as procexp from Sysinternals for Windows), to monitor SQL Server in real time. You can adjust the ‘cost threshold for parallelism’ and see the effect on parallel execution and CXPACKET waits in real time. It’s here (free, of course):
    https://fabiensabinet.wordpress.com/2017/06/21/sql-procexp-en/

    Reply
  • Having only 4 cores at work, this was an interesting post to learn from, thank you. I also got a free tool too, Liking SQL Process Explorer very much.

    Reply
  • I love the idea of increasing MaxDOP when the complexity increases. As said, it’s now all or nothing (actually: nothing or all)

    Reply
