Here’s how Books Online describes the Max Degree of Parallelism setting:
You can use the max degree of parallelism option to limit the number of processors to use in parallel plan execution.
And here’s what the SQL Server 2019 setup screen says:
When an instance of SQL Server runs on a computer that has more than one CPU logical core, it detects the best degree of parallelism, that is, the number of processors employed to run a single statement, for each parallel plan execution. MAXDOP specifies the maximum number of cores to utilize for this instance.
Uh, no, SQL Server doesn’t really detect that.
In most cases, if SQL Server decides to parallelize your query, it’s goin’ straight to MAXDOP.
I’ll demo it with a 64-core server with SQL Server 2019’s defaults.
I’m using an i3.16xlarge AWS EC2 instance with 2 sockets, 64 cores altogether:
During setup, SQL Server suggests that we set MAXDOP to 8:
Which I find kinda amusing, since on an i3.metal with 72 cores (just 8 more cores), SQL Server 2019 setup suggests a MAXDOP of 15, but whatever.
I’ll add an index, then run a simple query against the Users table in a large Stack Overflow database:
1 2 3 4 5 6 7 8 9 |
/* Add an index to make our query easier: */ CREATE INDEX Location ON dbo.Users(Location); GO /* Run a tiny query with a key lookup: */ SELECT u.DisplayName, u.Id FROM dbo.Users u WHERE u.Location = N'San Diego, CA' ORDER BY u.DisplayName; |
The query just barely clears the default Cost Threshold for Parallelism (5), so it goes parallel:
Well, lemme rephrase that – the query looks like it goes parallel in the sense that there’s a parallelism operator, and icons on the plan have what Erik Darling calls racing stripes.
However, if you right-click on each operator, like the sort, and look at the number of pages assigned to each thread, it isn’t what you would call balanced. One thread read all of the data and did all of the work while the rest of the threads went out for a smoke break.
This is why your queries produce CXPACKET waits: SQL Server doesn’t do a great job of balancing work across cores, and when that work isn’t evenly balanced, SQL Server has to account for the time the idle cores aren’t doing any work.
If we raise our MAXDOP higher, the problem becomes worse. I’ll set MAXDOP up to 64, and then run the query again:
1 2 3 4 5 6 7 8 9 |
EXEC sp_configure 'show advanced', 1; RECONFIGURE EXEC sp_configure 'max degree', 64; RECONFIGURE GO SELECT u.DisplayName, u.Id FROM dbo.Users u WHERE u.Location = N'San Diego, CA' ORDER BY u.DisplayName; |
The execution plan looks identical:
But right-click on that Sort operator, and poor Thread 4 is the only thing holding this thing together:
And the query burned up 64 worker threads and generated 620 milliseconds of CXPACKET wait – in a query that finished in under 100ms:
This is why it’s so important to set Cost Threshold for Parallelism correctly.
It’s not enough just to set MAXDOP – because MAXDOP simply isn’t a maximum. It’s more like a minimum. If your query crosses over the CTFP, buckle up: you’re going way parallel, and even just MAXDOP 8 has CXPACKET issues you can’t fix.
There are indeed some cases where the degree of parallelism will be lower than MAXDOP – for example, when you try to set MAXDOP 0 on a 72-core server – but it’s nowhere near like what the documentation describes, which makes it sound like a query-by-query decision based on the size of your workload.
11 Comments. Leave new
Hello,
So what is equation or recommendation that be set Cost Threshold for Parallelism with MAXDOP value?
Thanks,
Chetan
Hi, Chetan. I cover that here: https://www.brentozar.com/archive/2013/08/what-is-the-cxpacket-wait-type-and-how-do-you-reduce-it/
How is the NUMA configured for the i3.16xlarge and i3.metal? The same MAXDOP documentation mentions defaults based on NUMA, maybe there’s a catch?
Bartosz – I talk about that in my Mastering Server Tuning class in the parallelism modules.
I have a server with 56 cores. Many queries go parallel and CXPACKET waits are frequent. Have tested with MAXDOP = 0, 8, or 4. Cost of parallelism = 50. Haven’t seen much difference thus stick with = 4 for now.
Pedro wrote some very useful information about MaxDoP back in July
https://techcommunity.microsoft.com/t5/sql-server/what-is-maxdop-controlling/ba-p/1505968
Came across this article (https://community.dynamics.com/crm/b/crminthefield/posts/dynamics-365-ce-on-premises—understanding-sql-server-parallelism) from Microsoft discussing MAXDOP in regards to CRM, but the thing that caught me attention was the query cost part where they say the default cost of 5 means 5 seconds
That’s so sad: even Microsoft employees get that completely and utterly wrong.
It seems DOP feedback feature (i.e SQL 2022) might help in adjusting it during subsequent executions.
https://www.erikdarlingdata.com/whats-the-point-of-dop-feedback-in-sql-server-2022/
We’ve got hope for that too! Have you seen it in action yet, or seen it demoed?
Speaking as a developer, we have ALL been Thread 4 before.