What are Page Faults?
If you’ve ever been monitoring a Linux process you might notice an odd metric: Page Faults. On the surface Page faults can sound scary. But in reality, a page fault is the natural process of requesting physical Memory to be allocated to different processes.
Linux divides up its physical Memory into “pages,” which are then mapped to virtual memory for individual processes. Each “page” equates to about 4kb of physical memory. And, just because the pages are mapped to a certain process, this doesn’t necessarily mean the process is actively using those pages. There can be both inactive and active pages in use by a process on the server.
So what is a page fault? Don’t worry, it doesn’t really have anything to do with errors. A page fault simply happens when a program is executing some code, and the executable code for that program isn’t currently being actively held in physical pages of Memory. Linux responds by allocating more pages to that program so that it can complete its execution of the code.
Minor Page Faults vs Major Page Faults
You may also see a shockingly high number of “minor page faults” in monitoring your process. Minor page faults are even less worrisome than major ones. This simply means a process requested data that’s already stored in Memory. The Memory just wasn’t allocated to the process requesting it. Linux will remap the currently used data to be shared between the two processes, without having to access the physical memory. The easiest way to remember the difference is: minor page faults can be served from existing virtual memory, while major page faults have to request more pages be allocated from the physical memory.
Page Faults and Swap
So what happens if your server has already mapped all its pages of the physical memory to different processes, but more Memory is needed to perform a task? The system will begin “writing to swap.” This is not a great scenario in any situation. The system will write some of the pages it’s holding onto in Memory to the disk. This frees some pages to satisfy incoming page fault requests.
Compared to using virtual Memory, writing pages to disk is super slow. And, writing to disk is an IO operation which can be throttled on many hosts if it happens too much. In regards to page faults, the process of using swap is the most worrisome. Using swap can easily become a downward spiral! Pages requested are written to disk, which overwrites other pages requested previously. But when the previous pages are requested again the cycle repeats itself again and again, until your system crashes.
With that in mind, the important aspects of your system to monitor are not necessarily the major or minor page faults, but rather the amount of swap being used.
Have you experienced issues with swap usage or page faults on your own system? Tell me about it in the comments.