Page Life Expectancy isn’t what you think

Updated 5/1. 1/2.

There’s a lot of controversy about the Buffer Manager performance object counter Page Life Expectancy, mostly around people continuing to quote 300 as the threshold. That’s far too *low* to be the point at which to start worrying if your PLE dips and stays there. Jonathan came up with a better number to use, based on the size of your buffer pool (see the bottom of his post). But that’s not why I’m writing today: I want to explain why most of the time now Page Life Expectancy is really not giving you useful information.

Most new systems today use NUMA, and so the buffer pool is split up and managed per NUMA node, with each NUMA node getting its own lazy writer thread, managing its own buffer free list, and dealing with node-local memory allocations. Think of each of these as a mini buffer pool.

The Buffer Manager: Page Life Expectancy counter is calculated by taking the PLE of each mini buffer pool and then calculating the mean. But it’s not the arithmetic mean as we’ve all thought forever, it’s the harmonic mean (see Wikipedia), so the value is lower than the arithmetic mean whenever the per-node values differ. (Thanks to Matt Slocum for pointing out a discrepancy from the arithmetic mean on a large NUMA system and making me dig into this more, and my friend Bob Dorr from CSS for digging into the code.)

What does this mean? It means that the overall PLE is not giving you a true sense of what is happening on your machine, as one NUMA node could be under memory pressure but the *overall* PLE would only dip slightly. One of my friends who’s a Premier Field Engineer and MCM just had this situation today, which prompted this blog post. The conundrum was how there could be memory pressure on the machine while the overall PLE is relatively static, and this was the issue.

For instance, on a machine where every NUMA node has the same PLE, the overall PLE is that same value. The calculation is: add the reciprocals of the scaled PLE for each node, divide that sum into the number of nodes, and then divide the scaling factor back out.

Now, if the PLE of one node drops, the overall PLE drops by far less than that node’s own PLE does. If you had an alert set watching for a drop in the overall PLE, it might not fire, even though one of the buffer nodes was under high pressure. And you have to be careful not to overreact too: the harmonic mean is dominated by its smallest values, so a very low PLE on a single node drags the overall PLE down sharply even when the other nodes are fine.

On NUMA machines, you need to be looking at the Buffer Node: Page Life Expectancy counters for all NUMA nodes, otherwise you’re not getting an accurate view of buffer pool memory pressure and so could be missing or overreacting to performance issues. And adjust Jonathan’s threshold value according to the number of NUMA nodes you have.

You can see the lazywriter activity for each NUMA node by looking for the lazywriter threads in sys.
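To make the averaging effect described above concrete, here is a minimal Python sketch of the harmonic mean next to the arithmetic mean. The node count and the PLE values are hypothetical numbers picked for illustration, the function names are my own, and the internal scaling factor is left out because it cancels. This is not how SQL Server itself is coded, just the arithmetic.

```python
def overall_ple(node_ples):
    """Harmonic mean: number of nodes divided by the sum of reciprocals."""
    return len(node_ples) / sum(1.0 / ple for ple in node_ples)


def arithmetic_ple(node_ples):
    """Plain arithmetic mean, shown only for comparison."""
    return sum(node_ples) / len(node_ples)


def nodes_below(node_ples, threshold):
    """Indexes of the nodes whose own PLE is below the threshold."""
    return [i for i, ple in enumerate(node_ples) if ple < threshold]


# Hypothetical per-node PLE values (seconds) for a 4-node machine.
healthy = [4000, 4000, 4000, 4000]
one_node_dips = [4000, 4000, 4000, 2200]
one_node_collapses = [4000, 4000, 4000, 200]

for label, ples in [
    ("healthy", healthy),
    ("one node dips", one_node_dips),
    ("one node collapses", one_node_collapses),
]:
    print(
        f"{label}: harmonic={overall_ple(ples):.0f} "
        f"arithmetic={arithmetic_ple(ples):.0f} "
        f"nodes under 3000: {nodes_below(ples, 3000)}"
    )
```

With these made-up values, the dip on one node moves the overall number far less than the node itself moved, while the collapse on one node pulls the overall number well below what the other three nodes are seeing. Checking each node, as `nodes_below` does, avoids both mistakes.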