Page Life Expectancy Doesn’t Mean Jack, and You Should Stop Looking At It.

By Brent Ozar · June 17, 2020 · 28 comments

Page Life Expectancy is a Perfmon counter that’s supposed to track how long pages stay in memory before they get flushed out to make room for other pages that are needed instead.

Paul Randal has blogged this a few times, and rather than rehash ’em, I’d rather point you to a couple of his roundups:

And layer in a few of my own observations for context:

Page Life Expectancy goes up 1 for each second that you’re not under memory pressure. Restart your SQL Server instance and watch PLE: it starts at 1, and goes up by 1 for each second of uptime. 5 minutes of uptime = PLE 300. For the first 5 minutes, it’s not like your server’s under memory pressure – it just woke up, for crying out loud. Give him a minute. Well, 15-20, I guess, because PLE is near useless during that time span.

Page Life Expectancy drops can be triggered by confusing operations. By default, any one running query can get a memory grant the size of 25% of your buffer pool. Run a few of those queries at the same time, and your buffer pool gets drained – but PLE doesn’t necessarily drop. However, the instant an unrelated query runs and needs to get data that isn’t cached in RAM, your PLE will drop catastrophically. Which queries are at fault? The queries getting large grants, or the queries doing reads? “Yes.”

Page Life Expectancy is a lagging indicator. A lagging indicator tells you about an emergency long after the emergency has happened, and it doesn’t recover quickly after the emergency is over. When you combine the above two problems – PLE only rising 1 per second, and PLE dropping at times that aren’t necessarily tied to the dropping buffer pool – then if you’re alerting based on low PLE numbers, you could have already missed the emergency. By the time you log in and look for long-running queries, it’s already too late. Instead, you should be using leading indicators: things that tell you a problem is coming up.
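To make the lag concrete, here is a toy simulation in Python. It is a sketch, not how SQL Server computes the counter: it assumes PLE equals seconds since the last flush, that a single flush event resets it all the way to zero, and that you alert whenever PLE is under a hypothetical threshold of 300. The function names and the numbers are made up for illustration.

```python
def simulate_ple(duration_s, flush_times, start_ple=0):
    """Toy model: PLE rises by 1 each second and resets to 0 at each flush.

    Index t of the returned list is the PLE reading at t seconds of uptime.
    Assumption: a real drop is not necessarily all the way to zero.
    """
    flushes = set(flush_times)
    ple = start_ple
    timeline = []
    for t in range(duration_s):
        if t in flushes:
            ple = 0  # one unlucky read flushes the cache
        timeline.append(ple)
        ple += 1  # no pressure: the counter climbs 1 per second
    return timeline


def seconds_below(timeline, threshold):
    """Count how many one-second samples would trip a 'PLE is low' alert."""
    return sum(1 for value in timeline if value < threshold)


# One hour of uptime with a single one-second flush event at the 30-minute mark.
timeline = simulate_ple(duration_s=3600, flush_times=[1800])
alert_seconds = seconds_below(timeline, threshold=300)
print(alert_seconds)  # 600: 300 s of post-restart warm-up plus 300 s after the flush
```

In this toy run the alert fires for ten minutes in total, yet only one second of that hour involved anything happening to the cache. The rest is the counter slowly climbing back.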

With that stuff in mind, I’ve removed the Page Life Expectancy warnings from sp_BlitzFirst altogether. I think they’re distracting y’all from the real leading indicator, wait stats – and of course, sp_BlitzFirst shows that, so I’d rather focus your attention there.
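If you want a feel for what "look at wait stats instead" could mean in practice, here is a minimal Python sketch. It is not how sp_BlitzFirst works. It assumes you have already collected a snapshot of wait time per wait type (the sample numbers below are invented), and it flags the two wait families that come up in the comments below, PAGEIOLATCH and RESOURCE_SEMAPHORE, when they make up a meaningful share of total wait time. The 10% cutoff and the function name are arbitrary assumptions.

```python
# Wait type prefixes treated as memory-related in this sketch (an assumption).
# Note the prefix match also catches variants like RESOURCE_SEMAPHORE_QUERY_COMPILE.
MEMORY_WAIT_PREFIXES = ("PAGEIOLATCH", "RESOURCE_SEMAPHORE")


def memory_pressure_waits(wait_ms_by_type, min_share=0.10):
    """Return memory-related wait types whose share of total wait time
    is at least min_share, mapped to that share (rounded to 3 places)."""
    total = sum(wait_ms_by_type.values())
    if total <= 0:
        return {}
    flagged = {}
    for wait_type, wait_ms in wait_ms_by_type.items():
        share = wait_ms / total
        if wait_type.startswith(MEMORY_WAIT_PREFIXES) and share >= min_share:
            flagged[wait_type] = round(share, 3)
    return flagged


# Hypothetical snapshot: wait time in milliseconds per wait type.
sample = {
    "PAGEIOLATCH_SH": 600_000,
    "RESOURCE_SEMAPHORE": 250_000,
    "CXPACKET": 100_000,
    "SOS_SCHEDULER_YIELD": 50_000,
}
flagged = memory_pressure_waits(sample)
print(flagged)
```

The design choice here is to rank by share of total wait time rather than by raw milliseconds, so the same cutoff works for a busy server and a quiet one.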

28 comments

Recce

June 17, 2020 at 6:18 am

Say it isn’t so!!!!!
Next thing you’ll say is I shouldn’t be looking at disk queue length and dividing it by the number of spindles in my RAID-5 array

Reply
Recce

June 17, 2020 at 6:45 am

I do wish “best practice” came with a BBE date.

Reply
Grzegorz ?yp

June 17, 2020 at 8:08 am

What should be used instead of PLE to spot memory pressure problems?

Reply
1. Brent Ozar
  
  June 17, 2020 at 8:08 am
  
  Wait stats, as I explain in the post.
  
  Reply
Roman

June 17, 2020 at 9:15 am

“Page Life Expectancy Low” Finding is removed? Please no.

I have inherited a lot of old apps and procedures from former SQL admins which I will never understand. So when I find a “Page Life Expectancy Low” issue I throw RAM at the server.

Reply
1. Brent Ozar
  
  June 17, 2020 at 9:15 am
  
  Roman – instead, look for high PAGEIOLATCH or RESOURCE_SEMAPHORE waits, and throw memory at it in those situations. It’ll be way more cost-effective.
  
  Reply
  1. Roman
    
    June 17, 2020 at 11:34 am
    
    Thanks for the hint. I get both together right now, but I have never taken a look at PAGEIOLATCH because its priority is much lower:
    
    Priority  FindingsGroup       Finding
    50        Server Performance  Page Life Expectancy Low
    200       Wait Stats          PAGEIOLATCH_SH
    
    Reply
    1. Brent Ozar
      
      June 17, 2020 at 11:37 am
      
      Yep, that’s exactly why I’m changing it: people were focusing too much on the wrong metric. Glad to see the change is having the desired effect.
      
      Reply
  2. John Bougeois
    
    March 12, 2024 at 3:43 pm
    
    I had a situation where I was experiencing high RESOURCE_SEMAPHORE waits. The issue was indexing. I had been reorganizing indexes, and when I changed that to an online index rebuild, it solved the issue. Reorganizing just was no longer doing the job. But be careful with index rebuilds, especially in SQL Server Standard: they can cause table locking and blocking.
    
    Reply
2. Jeffrey Mergler Student since 2017
  
  June 17, 2020 at 9:38 am
  
  @Roman, you can use Tim Radney’s script if you feel the need to query PLE: http://timradney.com/2013/03/08/what-is-page-life-expectancy-ple-in-sql-server/
  
  That said, I will now start homing in on wait stats as Brent mentions above.
  
  Reply
Jeffrey Mergler Student since 2017

June 17, 2020 at 9:35 am

No complaints here, I didn’t even notice PLE was in BlitzFirst and there are plenty of scripts which give you PLE.

Reply
Vince Thomas

June 17, 2020 at 11:01 am

I agree. When we switched over to very fast SSD drives a couple years ago, I stopped worrying about PLE.

Reply
Martin

June 17, 2020 at 10:04 pm

I remember, back in the day, it was all “buffer cache hit ratio is great”, then “no, BCHR is garbage – use PLE – it’s great”. Now “PLE is garbage, use wait stats”. Am I starting to sound old???

Reply
Dave Potts

June 18, 2020 at 4:27 am

I take the point about PLE being a lagging indicator, but so are wait stats, aren’t they?

We have PLE checks but they look at the direction of travel not the values. If it is going up or flat with ‘high enough’ value we don’t generate alerts. If it is flat with a low value or falling then we need to take a look.

Reply
1. Brent Ozar
  
  June 18, 2020 at 5:43 am
  
  Nah, like I mention in the post, you can hit RESOURCE_SEMAPHORE waits long before PLE starts to drop.
  
  Reply
Filip

June 19, 2020 at 6:44 am

The good guys of Solarwinds should read this. 🙂

Reply
1. Brent Ozar
  
  June 19, 2020 at 7:57 am
  
  Well, don’t blame monitoring tools: when I worked for Quest, we had that same discussion, but the product management team said, “We can’t remove that metric. Too many DBAs *THINK* it’s a useful metric, and if we don’t show it, they’ll think our tool sucks.” So I get why the vendors have to put it on there.
  
  Reply
  1. Filip
    
    June 19, 2020 at 7:59 am
    
    Haha yeah i don’t 🙂 Still love Solarwinds!
    
    Reply
  2. John Zabroski
    
    December 5, 2022 at 11:47 pm
    
    I like Page Life Expectancy over long periods of time to understand the general load patterns on a server. Agree, it is a lagging indicator, but lagging indicators are often good “acausal metrics”. Acausality is a weird topic for most engineers to wrap their heads around, but it’s how airplanes remain stable in flight – wings of the plane get input from a sensor at the nose cone of the plane on possible lift conditions, so the wings know milliseconds before it happens what the wind conditions will be.
    
    In the same way, if you’re coming in as a consultant and trying to understand load patterns, page life expectancy is a really cheap way to learn something about a server that has no prior diagnostic data available – assuming they didn’t just recycle the box.
    
    With cloud databases becoming more common, eventually it will fade away completely, since you will just have all the useful metrics available to you.
    
    And, back in the 2000s when Trust In The Rust spinning disks were still common, page life expectancy was somewhat useful for a different reason, but back then I preferred to just watch query cache metric in Windows PerfMon along with CPU activity and see if my query was parallelizing perfectly and how much of the cache was hit or missed.
    
    Reply
Jon

June 24, 2020 at 6:38 pm

Heyo Brent, ironically timed article because I was just asking about this on StackExchange two weeks ago. One of the first things I looked into were wait stats for any indication of memory pressure. I noticed my top 2 wait stats were: “MEMORY_ALLOCATION_EXT” and “RESERVED_MEMORY_ALLOCATION_EXT”. Are these any indication of memory pressure, or should I be looking for other specific wait stats? (These two were my highest wait types by an order of magnitude more than the next highest.) Thanks.

Reply
1. Brent Ozar
  
  June 24, 2020 at 6:44 pm
  
  Hi! For personal help with your server, click Consulting at the top of the site. Thanks!
  
  Reply
WillemHenk Student since 2021

October 22, 2020 at 10:55 pm

I just came across this post from someone who recommended looking at it. 😛 [deleted] Kinda old, but it pops up on Google right away. 🙂

Reply
1. Brent Ozar
  
  October 23, 2020 at 3:47 am
  
  Thanks for the heads up! Deleted that.
  
  Reply
RasmusAlsen

December 22, 2022 at 10:37 am

I must break down and confess I use this to indicate that I have done a good job.

Some 3 years ago the Page Life Expectancy on our SQL Server was always between 6.000 and 12.000.
We then installed some more memory and it went up between 40.000 and 50.000

I then had the BI and .Net team start using a redundant database for their work.
Their last queries on the production database were performed 2 days ago and the PLE has been climbing since.
It is now showing 141.000 and we have never had a more stable production system.

Am I wrong for using this as one (of many) indicators that things are better now than 3 years ago?

Reply
1. Brent Ozar
  
  December 24, 2022 at 3:32 pm
  
  Time to head over to my Mastering Server Tuning class.
  
  Reply
B Stevens

December 13, 2024 at 9:18 pm

This is complete BS and you have no idea what you’re talking about Brent.

I’ve used PLE for decades, with complete success at what I’m doing. Sure, there are some minor qualifications (like every other damned thing in SQL Server), but your claim that it “doesn’t mean jack” and “you should stop looking at it” is false and utter BS.

Hope that helps!

Reply
1. Brent Ozar
  
  December 14, 2024 at 12:01 am
  
  You sound like a professional, especially given that your comment was submitted with a made-up email address. I’m sure your blog post with evidence will be coming along Real Soon Now™.
  
  In the meantime, keep up the “good” work, and I look forward to continuing to inherit your clients and fixing things up. Cheers!
  
  Reply
