(Cross-posted from the PASS Blog.)
Years ago, my first go-to tool for performance tuning was Performance Monitor: it was available everywhere, worked in all versions, and gave me a lot of useful information without paying for a monitoring tool. Since then, I’ve moved on – wait stats are a much better starting point – but there are still 3 good Perfmon counters that I rely on.
No, not Page Life Expectancy, Buffer Cache Hit Ratio, and Disk Queue Length. Those are still hot garbage.
1. Forwarded Fetches per Second
When your tables don’t have clustered indexes, you’ve got a heap: a table organized in random order. Heaps are notoriously bad ideas if you do updates on variable-length fields, because when SQL Server runs out of space on a page, it moves the row to another page with free room – and leaves behind a forwarding pointer at the row’s original location. The more of these forwarding pointers you have, the higher the penalty you pay when scanning the table.
The Perfmon counter SQLServer:Access Methods: Forwarded Records/sec tracks how many times SQL Server had to jump around while reading. (Think of these as extra logical reads to deliver query results.) The higher this number is, the more extra logical reads you’re doing. I don’t gauge the number by itself – I look at it as a percentage of the reads we’re doing overall. If I can cut the logical reads we’re doing by, say, 10% just by rebuilding the heap, great! That’s an easy win.
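If you want to see which heaps in the current database are accumulating forwarded records, and fix one, something like this sketch works – note that dbo.MyHeap is a placeholder table name, and that sys.dm_db_index_physical_stats in SAMPLED or DETAILED mode can be expensive to run on large databases:

```sql
-- Count forwarded records per heap in the current database.
-- forwarded_record_count is NULL in LIMITED mode, so use at least SAMPLED.
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       ips.forwarded_record_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ips
WHERE ips.index_type_desc = 'HEAP'
  AND ips.forwarded_record_count > 0;

-- Rebuilding the heap removes the forwarding pointers
-- (dbo.MyHeap is a hypothetical table name):
ALTER TABLE dbo.MyHeap REBUILD;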
With a name like “Forwarded Records/sec” you would reasonably assume that querying this number would get you the, uh, forwarded records that occurred in any given second. Nope. The raw value – the one you see when you query sys.dm_os_performance_counters – is cumulative from when SQL Server last restarted; the Performance Monitor tool does the per-second math for you. That means if you’re reading it from the DMV, in order to get actionable data, you need to measure this over time and do differentials between measurements.
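A minimal sketch of that differential, assuming you take two samples ten seconds apart (the RTRIM is there because counter_name is a fixed-width column padded with trailing spaces):

```sql
-- Two samples of the raw (cumulative) counter, 10 seconds apart.
DECLARE @sample1 BIGINT, @sample2 BIGINT;

SELECT @sample1 = cntr_value
FROM sys.dm_os_performance_counters
WHERE RTRIM(counter_name) = 'Forwarded Records/sec';

WAITFOR DELAY '00:00:10';

SELECT @sample2 = cntr_value
FROM sys.dm_os_performance_counters
WHERE RTRIM(counter_name) = 'Forwarded Records/sec';

-- Divide the delta by the sample interval to get a true per-second rate.
SELECT (@sample2 - @sample1) / 10.0 AS forwarded_records_per_sec;
```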
2. Memory Grants Pending
When your query starts executing, it usually needs a memory grant to do things like sorts, joins, and parallelism. I find this topic really intriguing, and you can probably tell – we’ve got a whole category of memory grant posts.
SQLServer:Memory Manager: Memory Grants Pending tells you the number of queries that are waiting on a memory grant before they can even START executing. When this number > 0, your apps are probably getting query timeouts, and the sysadmins are confused because CPU time looks low, but SQL Server is slow.
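This one is easy to check from the DMV, because it’s a point-in-time value rather than a cumulative one – no differential needed. A quick sketch, with a follow-up query against sys.dm_exec_query_memory_grants to see which sessions are actually stuck waiting:

```sql
-- Memory Grants Pending is a point-in-time (absolute) counter,
-- so you can just read it directly.
SELECT cntr_value AS memory_grants_pending
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Memory Manager%'
  AND RTRIM(counter_name) = 'Memory Grants Pending';

-- If it's above zero, see who's waiting: a NULL grant_time
-- means the query hasn't received its memory grant yet.
SELECT session_id, requested_memory_kb, request_time
FROM sys.dm_exec_query_memory_grants
WHERE grant_time IS NULL;
```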
3. Compilations per Second
When you execute your beautifully hand-crafted query, SQL Server has probably never seen anything like it before – well, at least not since it restarted – and has to build a query execution plan.
With a regular transactional app, the same queries probably come in over and over, just with different parameters. (Data warehouses and reporting systems with a lot of truly unique queries are different – we’ll set those aside for now.)
The SQLServer:SQL Statistics: SQL Compilations/sec counter measures the number of times per second (hahaha, you wish) that SQL Server has had to compile a plan for a query it hasn’t seen before. The best way to evaluate this is to frame it in terms of the SQLServer:SQL Statistics: Batch Requests/sec counter, which, as you can probably guess, is also not really measured per second. It’s totally up to you to do a differential, or to use the open source sp_BlitzFirst to do the differentials for you.
In a transactional system, I expect to see 90% (or higher) query plan reuse – so Compilations/sec should be 10% (or less) of the Batch Request/sec measure.
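That ratio can be sketched with the same two-sample differential technique, assuming a 30-second interval is representative of your workload:

```sql
-- Two samples, 30 seconds apart, of both cumulative counters.
DECLARE @comp1 BIGINT, @comp2 BIGINT, @batch1 BIGINT, @batch2 BIGINT;

SELECT @comp1  = MAX(CASE WHEN RTRIM(counter_name) = 'SQL Compilations/sec' THEN cntr_value END),
       @batch1 = MAX(CASE WHEN RTRIM(counter_name) = 'Batch Requests/sec'   THEN cntr_value END)
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%SQL Statistics%';

WAITFOR DELAY '00:00:30';

SELECT @comp2  = MAX(CASE WHEN RTRIM(counter_name) = 'SQL Compilations/sec' THEN cntr_value END),
       @batch2 = MAX(CASE WHEN RTRIM(counter_name) = 'Batch Requests/sec'   THEN cntr_value END)
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%SQL Statistics%';

-- In a transactional system, aim for 10% or less.
SELECT 100.0 * (@comp2 - @comp1) / NULLIF(@batch2 - @batch1, 0)
       AS compilations_pct_of_batches;
```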
These 3 Perfmon counters shouldn’t be the first place you look – wait stats are still a better starting point – but sooner or later, you’re going to hit one of these issues that require Perfmon counters to solve. Ol’ trusty Perfmon is still there for you.
Forwarded Records/sec is indeed a per-second counter – depending on where you are looking. sys.dm_os_performance_counters gives you what we call the raw values in Perfmon language. These raw values need to be massaged to give meaningful info. A large percentage of the performance counters on a machine share this property: we need to do calculations over them. Whoever exposes Perfmon counters (like SQL Server) assumes that we use a tool that does this massaging of the data – which the Performance Monitor tool in Windows indeed does. Run a query over a table with a bunch of forwarded records and you’ll see a spike in the counter, and then it drops back down to zero – certainly not an accumulated value. We need to be cautious when working with sys.dm_os_performance_counters, since 52% of the values in there need some calculation in order to get the “refined” value (what Performance Monitor shows us) instead of the raw value:
WITH i AS (
SELECT CASE WHEN cntr_type = 65792 THEN 'Absolute_value' ELSE 'Need calculation' END AS type_
FROM sys.dm_os_performance_counters
)
SELECT type_, COUNT(*) AS cnt
FROM i
GROUP BY type_;
Right, it’s up to the tool to sample and show these counters per second. SQL Server doesn’t – it’s up to us, the end users, to sample those, like I wrote in the post. Thanks!
If you read Forwarded Records directly from the Perfmon tool, you are looking at Forwarded Records/sec and not cumulative Forwarded Records since the start of the SQL Server service.
If you query SQL DMVs you are looking into cumulative Forwarded Records.
Also, thanks for the content 🙂
Right, being a SQL Server guy, you can probably guess which tool I use. 😀
Yeah, I’d rather not rebuild heaps. Unless Microsoft have finally sorted the issue (doubtful), a rebuild creates new IAM pages every time you do it. Then at some point those IAM pages drop out of the cache (no surprise). Have you waited for 4,000 random reads, even from an SSD, to select the first record of a heap? Yeah, it sucks. Microsoft have never loved heaps nor shown them any love. Even when Microsoft pretend to give heaps a little love with rebuilds, they fuck it up. Oh, and then there’s the other old issue I bet they still haven’t looked at: STATISTICS IO lies when it reads forwarded records.
I love heaps and I really like SQL Server; however, I absolutely hate heaps in SQL Server. They’re a travesty for anything other than a staging table, and even then there’s little benefit to them.
Nice article, but let’s all just agree that on-disk rowstore tables shouldn’t be heaps.
RUnwin – I’m with you, but be aware that there are folks who strongly disagree, like Marcus: https://use-the-index-luke.com/blog/2014-01/unreasonable-defaults-primary-key-clustering-key
I agree with him that heaps are good – just not in SQL Server. The penalties of using heaps in SQL Server make them too painful. I’m also on board that clustering keys shouldn’t automatically be primary keys. Clusters should almost always be on an ascending key; however, I don’t like surrogate primary keys.
There is also a small downside to heaps that makes them unreasonable for some tables: you generally need to update more indexes when you update a heap compared to a clustered table. All swings and roundabouts, really. Right tool for the right job and all 🙂
By the way I’m also down for Whisky and Yoga, where’s the meet-up?
Heh heh heh…
Hi Brent, why is page life expectancy on its own no longer a good indicator that either there’s not enough memory or queries aren’t caching pages efficiently? If PLE is constantly low for long periods, surely that’s a good indicator there are memory problems?
Dan – nah, and it’s kinda beyond what I can teach in a blog post comment, but we spend time on it in the Mastering Server Tuning class.
Are these courses available for Europe/UK yet? thanks
The reason why probably has to do with NUMA?