I think I’ve found a bug in SQL Server setup’s MAXDOP calculation, and I need you to take a second look. Setup is recommending MAXDOP 8:
Which is odd, because this is running on an AWS i3.16xlarge with 2 sockets, 2 NUMA nodes, 32 logical processors per node, 64 logical processors altogether. In this screenshot, I’ve laid Task Manager alongside setup so you can see what I mean:
If you click on the “Configure the max degree of parallelism” link in setup, it says:
In this case, the SQL Server has multiple NUMA nodes, with greater than 16 logical processors per node – that’s the last line of the screenshot. In that line, Microsoft says MAXDOP should be half of the number of logical processors with a max of 16 – so 16.
But it’s recommending 8. Hmm.
Is Soft-NUMA the culprit? No.
After SQL Server finishes installation and starts up, the error log shows automatic soft-NUMA kicked in because NUMA nodes had more than 8 physical cores:
So it created 4 NUMA nodes, each with 16 logical processors. In that case, maybe the setup calculation was taking that configuration into account – 4 nodes, each with 16 cores. In that case, let’s revisit the guidance:
Now, the number in play is the 3rd line of the screenshot – “Less than or equal to 16 logical processors per NUMA node.” In that case, we’re supposed to keep MAXDOP at or below the # of logical processors per NUMA node.
Now, technically 8 is below 16 – but where the heck is 8 coming from? Why not, uh, 16, or 4, or 12, or for that matter, 2?
I’m guessing I’m missing something obvious, but I asked on Twitter, and nobody figured it out. Worst case, either the documentation is wrong, or setup is wrong, there’s some kind of other recommendation that isn’t shown in setup – like maybe there’s a hidden max of MAXDOP 8 during setup?
So, what’d I miss?
Update 1: one of the smartest SQL Server folks I know, Joe Obbish, writes that he thinks it may not be documented anywhere. It’d be cool if we did get it documented, though – whatever logic is good enough for setup should be good enough for KB 2806535, so users can make that same decision as their VM sizes grow.
Mystery Solved: Yep, There’s a Mismatch.
Microsoft employee Sean Gallardy answered that setup uses a different formula than the documentation recommends:
Step 1: Calculate Hardware NUMA and Soft NUMA
Step 2: Decide whether Hardware or Soft NUMA will be used
Step 3: Divide the total logical processors by the NUMA used
Step 4: If > 15 LPs/NUMA, MAXDop = (LPs/NUMA)/2, otherwise MAXDop = LPs/NUMA
Which means that if you have:
- 14 logical processors per NUMA node: MAXDOP will be set to 14
- 16 logical processors per NUMA node: MAXDOP will be set to 8
<sigh> That doesn’t make any sense, but it is what it is. It’s probably always been this way since SQL Server 2016 “fixed” MAXDOP by setting it during setup.