As of CTP 3.3, SQL Server 2016 gets the same behavior as Trace Flag 2371 did in 2008 R2 SP1 and onward. That basically means that the bigger your table is, the fewer rows need to be modified before an automatic statistics update occurs.
The change was announced over here. At first I thought, woah, cool, they thought about this and made big changes. But no, much like Trace Flags 1117 and 1118 being enabled for tempdb, it’s just…
Remember in Mortal Kombat, when getting to fight Reptile made you cool? Then Mortal Kombat 2 came out, and he was a playable character, and everyone would call your wins cheap if you picked him? That’s sort of what this reminds me of. If you’re new to SQL, you probably won’t appreciate the differences these Trace Flags make. If you’ve been using it for a while, you’ll probably start sentences with “back in my day, we had to add startup parameters…” and chuff off to write miserably long blog posts about unique indexes.
As of 02/23/2016, it sounds like Trace Flag 8048 is also enabled by default in 2016. See quote about soft NUMA at the link.
I ran tests on some fairly large tables. I tried to run them on tables from 100 million to 1 billion rows, but I blew out the data drive of our AWS instance. So, uh, if you have a bigger server to test stuff out on, be my guest.
The basic concept was:
- Load a bunch of data into a table
- Update it 1000 rows at a time (I know, I know, but updating less than that took FOREVER)
- Run a query against it to invalidate stats
- If they reset, add a bunch more data and start over
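The loop above can be sketched as a toy simulation. This is Python rather than the actual T-SQL I ran, and it assumes the threshold formula that 2016 (and TF 2371 before it) appears to use for large tables — MIN(500 + 20% of rows, SQRT(1000 * rows)):

```python
import math

def stats_threshold(n):
    """Assumed modification threshold before an auto stats update:
    the lesser of the old 500 + 20% rule and SQRT(1000 * rows)."""
    return min(500 + 0.20 * n, math.sqrt(1000 * n))

def batches_until_invalidation(table_rows, batch_size=1000):
    """Simulate updating batch_size rows at a time until the
    modification counter crosses the threshold."""
    threshold = stats_threshold(table_rows)
    modifications = 0
    batches = 0
    while modifications < threshold:
        modifications += batch_size
        batches += 1
    return batches, modifications

# A 10 million row table: threshold is 100,000 modifications,
# so 100 update batches of 1,000 rows each
print(batches_until_invalidation(10_000_000))  # → (100, 100000)
```

Against a real server the counter lives in sys.dm_db_stats_properties, not a Python variable, but the loop logic is the same.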
What I ended up with was, well…
Here’s an abridged version of 10-20 million and 30-40 million rows, and how many modifications they took before a stats update occurred. If you follow the PercentMod column down, the returns diminish a bit the higher up you get. I’m not saying that I’d prefer to wait for 20% + 500 rows to modify, by any stretch. My only point here is that there’s not a set percentage to point to.
And, because you’re probably wondering, turning on Trace Flag 2371 in 2016 doesn’t make any difference. Here’s what 10-100 million look like, in 10 million row chunks.
If you can guess which side TF 2371 was on for, I’ll give you one merlin dollhairs.
This improvement is certainly welcome as a default, though it’s not all that ‘new’. My 2014 instance comes up with the same thresholds with 2371 enabled. Unless you’re working with pretty big tables, or used to managing statistics updates on your own, you likely won’t even notice the change.
Thanks for reading!
Brent says: It’s kinda like Microsoft is treating trace flags as alpha/beta tests for new features now. That’s right in line with how .com startups use feature flags.
I’ve been thinking a lot about TF 2371 lately and your article reminded me about it.
Let me set the mood:
A cold rainy day in the NW. SQL-2012, SP3. Many tables north of 10M rows. Scores of thousands added each day.
Is it a good idea to enable TF 2371 as a matter of course?
David – typically no, we don’t recommend enabling trace flags by default. They’re non-default routes through the code path of the SQL Server engine, which means you’re stepping a little further away from the most tested configurations.
My testing has shown virtually no benefit to trace flag 8048 in SQL Server 2016 – but not because of auto soft NUMA.
Auto soft NUMA only kicks in if more than 8 logical CPUs per NUMA node are detected. And soft NUMA only affects placement of incoming SQL Server connections. Trace flag 8048 changes the scope of spinlock resources from NUMA node level to core level.
SQL Server 2016 contains another change that may remove the need for trace flags 8048, 1236, and 9024 where they’ve previously been used: auto-promotion of spinlock scope. When contention is detected on a global spinlock resource (like the resources addressed by trace flags 1236 and 9024), it’s promoted to node level. When contention is detected on a resource at node level, it is promoted to CPU level, taking over what trace flag 8048 would have done. This enhancement in SQL Server 2016 doesn’t seem to be tied to core count at all – which is a good thing, because even 4 or 6 CPUs, if busy enough, can cause spinlock havoc.
Hey Erik. Just been looking at this related to a blog post of my own. Your figures are the best I’ve found for anyone trying to look at what the new statistics recompilation thresholds are. I noticed there appears to be a pattern which falls into a mathematical formula quite nicely. If you divide the total number of rows by 1,000, take the square root of the result, then multiply it back by 1,000 again, that gives you the threshold for that table size. e.g. 10,000,000 rows, divided by 1,000, gives you 10,000. Square root is 100; multiply again by 1,000 gives you 100,000. Not sure how it works for lower row counts, I might take a look. Cheers
Howdy! The graph here shows (I think) what you’re looking for.
Thanks – might save me the effort of… insert one row, run query, insert one more row, run query, repeat until eyes square. Though the maths nerd/OCD in me may still need to quantify it into an (un)tidy equation!
You know, I was going to do the one row insert thing, but it’s just too arduous, even if you automate it.
Since you’re interested, here are the scripts I used to do all my testing (along with a semi-related XE session). Feel free to kick them around, if you want.
The pattern you found is actually MIN ( 500 + (0.20 * n), SQRT(1,000 * n) ) where “n” is the number of rows in the table.
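Plugging the post’s table sizes into that formula gives thresholds that line up with the measured figures. A quick sketch (the numbers are just the math, not re-measured on a server):

```python
import math

# Threshold formula from the comment above:
# MIN(500 + (0.20 * n), SQRT(1000 * n)), where n = rows in the table
def stats_threshold(n):
    return min(500 + 0.20 * n, math.sqrt(1000 * n))

# The 10-100 million row chunks from the post
for rows in range(10_000_000, 100_000_001, 10_000_000):
    print(f"{rows:>11,} rows -> ~{stats_threshold(rows):>10,.0f} modifications")
# 10,000,000 rows -> 100,000 modifications; 100,000,000 -> ~316,228
```

For small tables the 500 + 20% half of the MIN wins, which is why the pattern only shows up once tables get reasonably big.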
[…] SQL Server 2016 automatically uses this improved algorithm. Woot! So if you’re using SQL Server 2016, you don’t need to decide. Erik Darling tested out the behavior in 2016 and wrote about it here. […]
Hi Brent … you said: “the bigger your table is, the fewer rows need to be modified before an automatic statistics update occurs.” That is not quite right. You still need more rows for bigger tables in order to trigger update stats. What is true, is that the bigger your table is, the smaller percentage of rows need to be modified before an automatic stats update occurs. 🙂
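That correction in numbers, using the MIN(500 + 20%, SQRT(1000 * n)) threshold discussed above (a sketch; percentages are computed, not measured): the absolute row count needed still grows with table size, while the percentage shrinks.

```python
import math

def stats_threshold(n):
    # MIN(old 500 + 20% rule, new SQRT(1000 * n) rule)
    return min(500 + 0.20 * n, math.sqrt(1000 * n))

for rows in (10_000_000, 100_000_000, 1_000_000_000):
    t = stats_threshold(rows)
    print(f"{rows:>13,} rows: ~{t:>9,.0f} modifications ({t / rows:.2%})")
#    10,000,000 rows: ~  100,000 modifications (1.00%)
#   100,000,000 rows: ~  316,228 modifications (0.32%)
# 1,000,000,000 rows: ~1,000,000 modifications (0.10%)
```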