How Should We Show Statistics Histograms in sp_BlitzIndex?

Last Updated January 4, 2019

If you’re a graduate of my free How to Think Like the SQL Server Engine course – and you’d better be, dear reader – then you’re vaguely familiar with DBCC SHOW_STATISTICS. It’s a command that shows you the contents of a statistics histogram.

When I’m doing the first two parts of the D.E.A.T.H. Method – (D)eduplicating identical indexes and (E)liminating unused indexes – I often wanna check the selectivity of the first field in the index. However, running DBCC SHOW_STATISTICS was painful because it’s not set-based – I couldn’t easily dump out the contents of lots of indexes at once.

That’s where sys.dm_db_stats_histogram comes in really handy. I’ve added a new section to the bottom of sp_BlitzIndex to show its contents when you’re examining a single table with the @TableName parameter:

sp_BlitzIndex's new statistics histograms at the bottom — sp_BlitzIndex’s new statistics histograms at the bottom

In that screenshot, I’m looking at the indexes on the Stack Overflow Users table, in particular, an index on DisplayName. The bottom section shows the histogram for that index, and if you keep scrolling down – in the real output, not the screenshot, silly – you can see the histograms for the rest of the indexes.

Now, this is a brand spankin’ new feature – it’s only in the dev branch of the First Responder Kit, and sys.dm_db_stats_histogram is only available on SQL Server 2016 SP1 CU2, 2017, and newer.

Right now, the stats on user-created indexes show up first, followed by the user-created stats, followed by the auto-created stats, all in alphabetical order. This feature also only works when you’re looking at a specific table with the @TableName parameter – it doesn’t really make sense to dump out all histograms across an entire database.

My question for you, dear reader, is: are there any changes you would want to this feature before I roll it out as part of the main branch? If you want to show other data, don’t just give me the wish list – give me the problem you’re trying to solve with the additional data, because there might be a better way to render it than what I’m showing.

You can leave a comment with any changes you’d like to see, or, if you’d like to edit the code and contribute changes yourself, check out the contribution guide.

Geoff Patterson
January 10, 2019 7:23 am

The histogram result set in your screenshot looks great and more user-friendly than the proc we created internally.

We’ve also found it quite helpful to use multiple result sets so that we can provide a quick overview of overall properties of the statistics (one row per statistic), density (one row per statistic column), and histogram (one row per histogram step, as in your screenshot).

In many cases where we are debugging a poor query plan, we’ve found that we do not need to look at the histogram at all because the aggregate MIN_RANGE_KEY / MAX_RANGE_KEY / sample_percent columns — or the density by leading tuple within a statistic — provide enough info to quickly form and test a hypothesis without having to get bogged down in hundreds of rows of histogram data.

Here’s a screenshot of the output our internal procedure for this produces on a StackExchange Users table, with a few sections we find particularly helpful highlighted in red:

https://imgur.com/a/D2ZVPOp

This might be a different direction than you want to go — especially because we are using some messy code like a loop for each statistic with code like “INSERT … EXEC(DBCC SHOW_STATISTICS() WITH NO_INFOMSGS, DENSITY_VECTOR” — but just wanted to share what’s been helpful for us in case it’s helpful for your tools in the future!

Reply
- Brent Ozar
  January 10, 2019 7:25 am
  
  Ooo, neat! That would make perfect sense to show on the top row of sp_BlitzIndex on a per-index basis, only running it for the index stats, but you’re right – the insert/exec is going to be a little messy. I’m intrigued though!
  
  Reply
Jim Parker
January 10, 2019 9:39 am

It is a great feature addition. Just a heads-up there is an issue with with the new dmv documented below. I have run into this during troubleshooting.

https://feedback.azure.com/forums/908035-sql-server/suggestions/34061962-dm-db-stats-histogram-missing-row-when-column-hist

“When statistics on a nullable column contain a NULL sys.dm_db_stats_histogram will not return a row for the corresponding histogram step (when RANGE_HI_KEY is a NULL).”

Reply
- Brent Ozar
  January 10, 2019 9:40 am
  
  Ooo, that’s great! Nice find.
  
  Reply