I see a lot of page faults on my system even though there appear to be plenty of Available Bytes. What is going on?

You are probably looking at the Page Faults/sec counter and interpreting it as the rate of demand paging. That is only partly right. In Windows Server, the Page Faults/sec counter includes both hard and soft page faults. (It also appears to include Cache Faults/sec, which are application-related file cache read misses.) As an indicator of demand paging, rely on the Page Reads/sec counter rather than Page Faults/sec. Page Reads/sec corresponds to the rate of hard page faults that specifically require a disk access to resolve.

There are two types of so-called “soft” faults in Windows Server that are included in the Page Faults/sec metric: Transition Faults/sec and Demand Zero Faults/sec. High rates of both types of soft faults are common even where there is an ample supply of RAM.
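To make the arithmetic concrete, here is a minimal Python sketch that splits one sample of the Memory counters into its soft-fault portion and the Page Reads/sec figure. The function name summarize_faults and the sample values are made up for illustration; only the counter names come from the text above, and the sketch assumes you have already collected the values by some other means.

```python
def summarize_faults(sample):
    """Split one sample of Memory counters into soft faults and disk reads.

    `sample` maps counter names to per-second rates. The soft-fault total
    is simply Transition Faults/sec plus Demand Zero Faults/sec.
    """
    soft = sample["Transition Faults/sec"] + sample["Demand Zero Faults/sec"]
    total = sample["Page Faults/sec"]
    return {
        "soft_faults_per_sec": soft,
        # Share of all reported page faults that needed no disk I/O.
        "soft_share": soft / total if total else 0.0,
        # The figure to watch when judging demand paging.
        "page_reads_per_sec": sample["Page Reads/sec"],
    }


# Hypothetical values, not measurements from a real system.
sample = {
    "Page Faults/sec": 2000.0,
    "Transition Faults/sec": 1200.0,
    "Demand Zero Faults/sec": 700.0,
    "Page Reads/sec": 5.0,
}

summary = summarize_faults(sample)
print(summary)
```

In a sample like this one, the Page Faults/sec number looks alarming, yet nearly all of it is soft faults and the Page Reads/sec rate is small.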

Transition faults are a by-product of the Windows Server page replacement policy and cannot easily be avoided. To determine which pages of a process address space are not currently in use, the Windows Memory Manager trims pages aggressively from each process working set. These pages are only stolen by the operating system provisionally, however. Recently trimmed pages remain in memory on the Standby List. (The Standby List is one of the components of the pool of Available Bytes.) The idea is that the next time threads from the process execute, they will reference the pages that make up the application’s current working set, and the Memory Manager will restore those pages swiftly. Recently trimmed pages that are re-referenced transition fault back into the process working set quickly because they normally remain in memory for some period of time after being trimmed. Because these transition faults are resolved by the OS without ever having to retrieve a page from the paging file on disk, they are not nearly as costly as a hard page fault that must be resolved from disk.

The mechanism for handling transition faults is straightforward. Pages on the Standby List remain in memory. Their corresponding Page Table Entries (PTEs) are marked invalid, but are also flagged as being in transition. A process thread accessing a memory location on an invalid page triggers a hardware interrupt. During interrupt processing, the operating system quickly determines that a page in transition was referenced. The PTE for the page is then restored, the page is returned to the process working set, and the instruction that failed due to an invalid address reference is re-executed. Trimmed pages that are not re-referenced in a timely manner remain on the Standby List until they are eventually moved to the Free List and, finally, the Zero List as they age.
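The sequence above can be sketched as a toy model in Python. This is not how the Windows Memory Manager is implemented; the class ToyMemoryManager and its methods are invented for illustration, pages are plain strings, and the model assumes that any page no longer in memory has to be read back from disk.

```python
from collections import OrderedDict


class ToyMemoryManager:
    """Toy model of trimming, the Standby List, and transition faults."""

    def __init__(self, pages):
        self.working_set = set(pages)   # pages with valid PTEs
        self.standby = OrderedDict()    # trimmed pages, oldest first
        self.free_frames = 0            # frames whose old contents are gone
        self.transition_faults = 0
        self.hard_faults = 0

    def trim(self, page):
        """Provisionally steal a page: PTE becomes invalid but 'in transition'."""
        if page in self.working_set:
            self.working_set.remove(page)
            self.standby[page] = True

    def age(self):
        """Move the oldest standby page to the free list; its contents are lost."""
        if self.standby:
            self.standby.popitem(last=False)
            self.free_frames += 1

    def touch(self, page):
        """Reference a page and report how the access was resolved."""
        if page in self.working_set:
            return "hit"
        if page in self.standby:
            # Still in memory: restore the PTE, no disk access needed.
            del self.standby[page]
            self.working_set.add(page)
            self.transition_faults += 1
            return "transition fault"
        # Assumption of this toy model: the page must come back from disk.
        self.hard_faults += 1
        self.working_set.add(page)
        return "hard fault"


mm = ToyMemoryManager(["A", "B"])
mm.trim("A")
mm.trim("B")
first = mm.touch("A")   # A is still on the Standby List
mm.age()                # B ages off the Standby List
second = mm.touch("B")  # B is no longer in memory
print(first, "/", second)
```

The point of the model is the middle branch of touch: as long as a trimmed page is still on the Standby List, getting it back costs only bookkeeping.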

Demand Zero faults are requests from executing processes for new pages. For security reasons, the operating system always returns an empty, zero-filled page when a process requests a new page of virtual memory. Demand Zero faults are satisfied with empty pages from the Zero List, or from the Free List if the Zero List is empty. The Demand Zero Faults/sec counter corresponds to the rate of requests by processes for new zero-filled pages. Since requests for new pages are satisfied directly from the pool of Available Bytes, Demand Zero faults are another category of soft faults that do not require disk I/O to resolve.

Consider a server process that allocates an area of virtual memory as a work area when requests for service arrive and later frees the memory when it has finished processing the request. Since freeing memory replenishes the pool of Available Bytes, this process might generate a high rate of Demand Zero Faults/sec without ever needing to access the paging file on disk.
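A rough simulation of that server process, again as a toy Python model rather than a description of Windows internals: ToyPagePool, serve_requests, and all of the sizes are hypothetical. Each first touch of a new page counts as one Demand Zero fault and is satisfied from the Zero List, or from the Free List when the Zero List is empty; freeing the work area returns the frames to the Free List.

```python
class ToyPagePool:
    """Toy model of the Zero List and Free List, counted in page frames."""

    def __init__(self, zeroed_frames, free_frames):
        self.zero_list = zeroed_frames
        self.free_list = free_frames
        self.demand_zero_faults = 0
        self.disk_reads = 0  # never incremented: no path here touches disk

    def first_touch(self):
        """Satisfy a request for a new zero-filled page."""
        self.demand_zero_faults += 1
        if self.zero_list > 0:
            self.zero_list -= 1
        elif self.free_list > 0:
            # Zero List is empty: take a free frame and zero it on the spot.
            self.free_list -= 1
        else:
            raise MemoryError("toy pool exhausted")

    def release(self, frames):
        """Freed memory replenishes the pool of available frames."""
        self.free_list += frames


def serve_requests(pool, requests, pages_per_request):
    """Allocate a work area per request, touch every page, then free it."""
    for _ in range(requests):
        for _ in range(pages_per_request):
            pool.first_touch()
        pool.release(pages_per_request)


pool = ToyPagePool(zeroed_frames=64, free_frames=64)
serve_requests(pool, requests=1000, pages_per_request=32)
print(pool.demand_zero_faults, pool.disk_reads)  # prints "32000 0"
```

Even with only 128 frames in the pool, the loop racks up tens of thousands of Demand Zero faults, because every freed frame is immediately available for the next request.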

Because soft faults do not generate I/O to disk, it is important to ignore the rate of soft faults when you are trying to figure out if you have enough RAM.
