SQL Server’s Cost Threshold for Parallelism

“Should a query get to use more than one CPU core?” That’s an important question for your SQL Server. If you’re not sure what parallelism is, get started by exploring the mysteries of CXPACKET with Brent. He’ll introduce you to the setting, ‘Cost Threshold for Parallelism’.

Let’s test Cost Threshold for Parallelism

I generate an estimated execution plan for the following query. I’m running against a copy of the StackOverflow database that doesn’t have many indexes.
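The original query isn’t reproduced here, but any aggregate over a large table with few supporting indexes will do the trick. Here’s a hypothetical stand-in against the StackOverflow database (the Posts table and OwnerUserId column are real; this is illustrative, not necessarily the exact query from the screenshots):

```sql
-- Hypothetical stand-in: a scan-and-aggregate over a big, mostly unindexed table
SELECT TOP 100 OwnerUserId, COUNT(*) AS PostCount
FROM dbo.Posts
GROUP BY OwnerUserId
ORDER BY COUNT(*) DESC;
```

In Management Studio, Ctrl+L (or “Display Estimated Execution Plan”) generates the estimate without running the query.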

I get a parallel plan with an estimated cost of 272.29. (I can tell it’s parallel because of the yellow circle with double arrows on three of the operators.)

[Screenshot: parallel execution plan]

I decide I’m going to test out ‘Cost Threshold for Parallelism’ and make this plan go serial, so I raise the threshold to 275, just above the plan’s estimated cost of 272.29. This is a server-wide configuration, but I’m on my test instance, so no worries.
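The change goes through sp_configure (a sketch; 275 is the value used in this test, picked to sit above the parallel plan’s 272.29 estimate):

```sql
-- 'cost threshold for parallelism' is an advanced option
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'cost threshold for parallelism', 275;
RECONFIGURE;
```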

I run my query again and look at my actual plan this time…

[Screenshot: actual execution plan, still parallel]

Hey wait, that’s still parallel! It has the same estimated cost, and that cost is below where I set the cost threshold for parallelism. This seems broken.

At this point, I might get confused and think SQL Server was using a cached plan. But changing configuration options like Cost Threshold for Parallelism triggers plan recompilation, so that shouldn’t be it.

What ‘Cost’ is SQL Server Using?

The secret is in the serial plan. I need to look at the estimated cost for this query — it’s the same as our original, but I’ve added a MAXDOP 1 hint to force a serial plan:
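The hint goes in an OPTION clause at the end of the query. Shown here with a hypothetical query shape against StackOverflow’s Posts table (the hint syntax is real T-SQL; the query itself is illustrative):

```sql
SELECT TOP 100 OwnerUserId, COUNT(*) AS PostCount
FROM dbo.Posts
GROUP BY OwnerUserId
ORDER BY COUNT(*) DESC
OPTION (MAXDOP 1);  -- force a serial plan so we can see its estimated cost
```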

[Screenshot: estimated cost of the serial plan]

The estimated cost for the serial version of the plan is 287.963, which is over the threshold I set at 275! SQL Server compares the serial plan’s cost, not the parallel plan’s, against the threshold when deciding who gets to go parallel. I can prove it by raising my cost threshold to just above this level:
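For instance (a sketch; 290 is an arbitrary value just above the 287.963 serial cost):

```sql
-- Any value above the serial plan's 287.963 estimate will do
EXEC sp_configure 'cost threshold for parallelism', 290;
RECONFIGURE;
```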

And now when I run my query (with no MAXDOP hint to force it), I get a serial plan.

[Screenshot: serial plan after raising the cost threshold]

Cost Threshold Doesn’t Just Apply to the Plan You See

Behind the scenes, the optimizer is considering many possible plans. The Cost Threshold for Parallelism setting comes into play during that process and uses costs that you don’t see in the final plan.

This can be tricky to remember. Special thanks to Jeremiah, who’s explained this to me in the past (more than once!). This is covered in SQL Server Books Online, but oddly enough most of us don’t seem to find it there.


20 Comments

  • Just a quick aside.

    It’s common to hear (or even say) that MAXDOP limits the number of processors that can be used, but that is not correct.

    I say it all the time to clients because they don’t have to know the technical minutiae. But we love the nuances.

    MAXDOP *only* controls the number of parallel threads PER OPERATOR during query execution.

    MAXDOP does NOT limit the TOTAL number of threads that can be used during processing of the query.

    • Ooo, so good news, I have a different post on this exact topic coming up soon! Foreshadowing.

      MAXDOP does control the core count for the query overall. But you’re totally right that cores are not the same thing as threads: a query limited to 2 cores can use many more worker threads. In the upcoming post I show how you can actually see that for an example query, so you won’t have to take my word for it.
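      In the meantime, here’s one rough way to peek (a sketch using real DMVs, not necessarily the method from that post): while the query runs in another session, count the tasks attached to its request.

      ```sql
      -- Tasks per active request; a parallel query shows more tasks than its MAXDOP
      SELECT r.session_id, COUNT(t.task_address) AS task_count
      FROM sys.dm_exec_requests AS r
      JOIN sys.dm_os_tasks AS t ON r.session_id = t.session_id
      GROUP BY r.session_id;
      ```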

      • Sweet. Thanks Kendra.

        A SQL internals guy sent me this when I was trying to understand all the sundry nuances.

        “a key reminder about MAXDOP is that it does NOT limit the TOTAL number of threads that can be used during processing of the query in question.”

        • And then there’s this:

          “Degree of parallelism determines the maximum number of CPUs that are being used; it does not mean the number of threads that are being used.”

          msdn.microsoft.com/…/ms178065(v=sql.100).aspx

          So, it’s nice to see Microsoft is also confused on this one. 🙂

          • I think both statements are technically correct. MAXDOP controls the number of CPUs/processor cores used by the query, and it applies to the query as a whole as well as to each operator.

            Operators can use more worker threads than cores. Worker threads are not the same as a CPU.

            I think they’re trying to say that a lower maxdop will result in fewer threads than a higher maxdop, but the number of threads per operator isn’t necessarily equal to maxdop. (The number of cores used by the query is equal to maxdop.)

            I show a lot more detail on this in my upcoming post. This post doesn’t even try to cover that — it’s just talking about what qualifies for parallelism.

            Edit: tried to clarify. words are hard.

          • Thanks again Kendra.

            I’ll be looking forward to your next post on the issue.

            Mike

      • You would think I would be smart enough not to argue with Kendra. Again, Kendra’s correct on this one and I was wrong. Oh, the pain of admitting I was wrong!! 🙂

        I blogged about it here: http://linkd.in/1ro0I5C

        Thanks… Mike

      • Kendra,

        Where can I find your follow up article related to MAXDOP?

  • “but oddly enough most of us don’t seem to find it there.” <– this describes my life with BOL. 🙂

  • Awesome article!

  • Vijay Govindan
    November 20, 2014 3:47 pm

    This is a perfectly timed article. We just talked to Microsoft for some accidental performance tuning, and they mentioned lowering MAXDOP. When I asked them about Cost Threshold for Parallelism (I found Brent’s old article!), they said the recommended value is the default of 5, but there is another school of thought on increasing the setting. Jeremiah had an article recommending setting it to 50, then testing in different environments to see what’s optimal.

    I was lucky enough to compare MAXDOP = 0 versus MAXDOP = 4 on our dev box. Kendra is absolutely correct. MAXDOP = 0 runs wide open and uses 16 threads; MAXDOP = 4 uses 4 threads. The difference on my sample query only reduced CPU time a small amount. Most of the cost was in the I/O.

  • Thanks for the article

  • This is a great post about a subject that has confused me for some time. Thank you.

  • Kendra,

    I have been tracking stored procs’ subtree costs where parallelism is used, and I see something different from what you are showing. I have the cost threshold set to 20 on a SQL 2012 SP3 build 6523 server. I am trying to see if we should raise it higher, but I see parallelism for queries with less than 0.08 subtree cost. Even when I force them to run serially, the cost is low. Looking at the plan, I can see the optimizer had an early termination reason of Timeout, so this must also throw the optimizer off.

  • Nice info! So does the cost threshold value depend on the MAXDOP value? Kindly clarify.
