Chapter 17: Paging
- What paging is and why it is used
- How virtual addresses are translated to physical through page tables
- The multi-level page table structure on ARM64
- Page sizes: 4 KB, 16 KB, 64 KB, and large pages (2 MB, 1 GB)
- How to create and manage page table entries
- How paging enables demand paging, copy-on-write, and memory-mapped files
17.1 What is Paging?
Paging is a memory management scheme that divides memory into fixed-size blocks called pages (virtual) and frames (physical). The kernel maintains a mapping from virtual pages to physical frames. When a program accesses a virtual address, the MMU translates it to the corresponding physical address using the page tables.
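Paging differs from segmentation, which divides memory into variable-sized segments. Paging eliminates external fragmentation: all pages are the same size, so any free frame can back any page and the allocator never has to search for a contiguous block of the right size. The cost is internal fragmentation, since an allocation that does not fill its last page still consumes the whole page.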
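To make the page/frame split concrete, here is a minimal sketch assuming 4 KB pages. The helper names (`page_number`, `page_offset`, `translate`) are illustrative and not part of our kernel; the point is only that translation replaces the page number and leaves the offset untouched.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                       /* assumed: 4 KB pages, 2^12 bytes */
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/* Virtual page number: which page the address falls in. */
static inline uint64_t page_number(uint64_t addr) {
    return addr >> PAGE_SHIFT;
}

/* Offset within the page: identical in the virtual and physical address. */
static inline uint64_t page_offset(uint64_t addr) {
    return addr & (PAGE_SIZE - 1);
}

/* Combine the frame number chosen by the kernel with the original offset. */
static inline uint64_t translate(uint64_t frame_number, uint64_t vaddr) {
    return (frame_number << PAGE_SHIFT) | page_offset(vaddr);
}
```

For example, virtual address 0x12345 lies in page 0x12 at offset 0x345; if that page is backed by frame 0x80, the physical address is 0x80345.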
17.2 Page Sizes on ARM64
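ARM64 supports three page sizes (translation granules), selected independently for each half of the address space by the TG0 and TG1 fields of the TCR_EL1 register. The two fields use different encodings:
| Page Size | TCR TG0 Field | TCR TG1 Field | Translation Levels (48-bit VA) |
|---|---|---|---|
| 4 KB | 00 | 10 | 4 levels (L0-L3) |
| 16 KB | 10 | 01 | 4 levels (L0-L3), but L0 has only 2 entries |
| 64 KB | 01 | 11 | 3 levels (L1-L3) |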
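Our kernel uses 4 KB pages, which is the most common choice and the default in most ARM64 Linux configurations. With 4 KB pages, each table holds 512 eight-byte entries, so each level resolves 9 bits of the address; covering a 48-bit virtual address therefore takes 4 levels.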
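The level counts in the table follow from simple arithmetic: a table occupies one page and each entry is 8 bytes, so a granule of 2^n bytes resolves n - 3 address bits per level. The sketch below works this out; the function names are made up for illustration, and it ignores extensions such as 52-bit addressing.

```c
#include <stdint.h>

/* Address bits resolved per level: one page of 8-byte entries. */
static int index_bits_per_level(int page_shift) {
    return page_shift - 3;
}

/* Number of levels needed to translate va_bits with the given granule. */
static int levels_needed(int va_bits, int page_shift) {
    int bits = va_bits - page_shift;            /* bits left after the page offset */
    int per  = index_bits_per_level(page_shift);
    return (bits + per - 1) / per;              /* round up */
}

/* Entries actually used in the top-level table. */
static uint64_t top_level_entries(int va_bits, int page_shift) {
    int per    = index_bits_per_level(page_shift);
    int levels = levels_needed(va_bits, page_shift);
    int top    = (va_bits - page_shift) - (levels - 1) * per;
    return 1ULL << top;
}
```

With a 48-bit VA this gives 4 levels and 512 top-level entries for 4 KB pages, 4 levels but only 2 top-level entries for 16 KB pages, and 3 levels with 64 top-level entries for 64 KB pages.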
17.3 Page Table Entry Format
Each page table entry (PTE) is 8 bytes. The format for an L3 (page) entry:
#include <stdint.h>

/* Page table entry format (ARM64, stage 1, 4 KB granule) */
#define PTE_VALID         (1ULL << 0)   /* Entry is valid */
#define PTE_TABLE         (1ULL << 1)   /* L0-L2: entry points to next level table */
#define PTE_PAGE          (1ULL << 1)   /* L3: same bit, set for a page; a block at L1/L2 has it clear */
#define PTE_ATTR_INDEX(n) (((uint64_t)(n) & 7) << 2) /* Memory attributes index into MAIR_EL1 (bits 4:2) */
#define PTE_NS            (1ULL << 5)   /* Non-secure */
#define PTE_AP_EL0        (1ULL << 6)   /* AP[1]: accessible from EL0 */
#define PTE_AP_RO         (1ULL << 7)   /* AP[2]: read-only (clear means read/write) */
#define PTE_SH            (3ULL << 8)   /* Shareability domain */
#define PTE_AF            (1ULL << 10)  /* Access flag */
#define PTE_NG            (1ULL << 11)  /* Not global */
#define PTE_ADDR          (0x0000FFFFFFFFF000ULL) /* Physical address (bits 47:12) */

/* Creating a page table entry */
uint64_t make_pte(uint64_t phys_addr, int writable, int user_accessible) {
    uint64_t pte = phys_addr & PTE_ADDR;
    /* Attribute index 0 assumes MAIR_EL1 slot 0 describes normal memory */
    pte |= PTE_VALID | PTE_PAGE | PTE_AF | PTE_ATTR_INDEX(0);
    if (!writable) pte |= PTE_AP_RO;        /* writable pages leave AP[2] clear */
    if (user_accessible) pte |= PTE_AP_EL0;
    return pte;
}
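Going the other way, from a raw entry back to its fields, is handy when debugging. The following is a small sketch, not part of our kernel: `struct pte_info` and `pte_decode` are hypothetical names, and the bit positions follow the Armv8-A stage 1 page descriptor, where AP[1] (bit 6) grants EL0 access and AP[2] (bit 7) marks the page read-only.

```c
#include <stdint.h>
#include <stdbool.h>

/* Decoded view of an L3 page descriptor (hypothetical helper type). */
struct pte_info {
    bool     valid;       /* bit 0 */
    bool     user;        /* AP[1], bit 6: accessible from EL0 */
    bool     read_only;   /* AP[2], bit 7 */
    bool     accessed;    /* AF, bit 10 */
    unsigned attr_index;  /* bits 4:2, index into MAIR_EL1 */
    uint64_t phys;        /* output address, bits 47:12 */
};

static struct pte_info pte_decode(uint64_t pte) {
    struct pte_info info;
    info.valid      = (pte >> 0) & 1;
    info.user       = (pte >> 6) & 1;
    info.read_only  = (pte >> 7) & 1;
    info.accessed   = (pte >> 10) & 1;
    info.attr_index = (unsigned)((pte >> 2) & 7);
    info.phys       = pte & 0x0000FFFFFFFFF000ULL;
    return info;
}
```

Printing these fields for the entry returned by a table walk quickly shows whether a fault was caused by a missing mapping or by the permission bits.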
17.4 Multi-Level Translation Walk
With 4 KB pages and 48-bit addresses, a virtual address is split into 9-bit indices for each level, plus a 12-bit page offset:
VA [47:39] -> L0 index (512 entries)
VA [38:30] -> L1 index (512 entries)
VA [29:21] -> L2 index (512 entries)
VA [20:12] -> L3 index (512 entries)
VA [11:0] -> Page offset (4 KB)
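One way to check an understanding of this layout is to build an address from its parts. The helper below is an illustrative sketch (`make_va` is not a kernel function) and produces lower-half addresses only; addresses translated through TTBR1 additionally have bits 63:48 set to ones.

```c
#include <stdint.h>

/* Assemble a 48-bit virtual address from four 9-bit indices and a 12-bit offset. */
static uint64_t make_va(unsigned l0, unsigned l1, unsigned l2, unsigned l3,
                        unsigned offset) {
    return ((uint64_t)(l0 & 0x1FF) << 39) |
           ((uint64_t)(l1 & 0x1FF) << 30) |
           ((uint64_t)(l2 & 0x1FF) << 21) |
           ((uint64_t)(l3 & 0x1FF) << 12) |
           ((uint64_t)(offset & 0xFFF));
}
```

For instance, `make_va(1, 2, 3, 4, 0x567)` yields 0x8080604567, and setting every field to its maximum yields 0x0000FFFFFFFFFFFF, the top of the 48-bit range.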
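The translation walk, which the MMU performs in hardware, can be reproduced in software:
/* Walk page tables manually (for debugging) */
uint64_t va_to_pa(uint64_t *ttbr0, uint64_t va) {
    int levels[] = { 39, 30, 21, 12 };
    uint64_t *table = ttbr0;
    for (int i = 0; i < 4; i++) {
        int idx = (va >> levels[i]) & 0x1FF;
        uint64_t pte = table[idx];
        if (!(pte & PTE_VALID)) return 0; /* not mapped */
        if (i == 3 && !(pte & PTE_PAGE)) return 0; /* reserved encoding at L3 */
        if (i == 3 || !(pte & PTE_TABLE)) {
            /* Page (L3) or block (L1/L2) entry: the offset spans every
               address bit below this level's index */
            uint64_t offset_mask = (1ULL << levels[i]) - 1;
            return (pte & PTE_ADDR & ~offset_mask) | (va & offset_mask);
        }
        /* Table entry: walk to next level (assumes tables are identity mapped) */
        table = (uint64_t *)(pte & PTE_ADDR);
    }
    return 0;
}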
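When a walk stops early at a block entry, the offset is wider than 12 bits because the entry maps a larger region. The sketch below, with illustrative helper names, computes how much address space one entry covers at each level for the 4 KB granule, which is where the 2 MB and 1 GB large page sizes come from.

```c
#include <stdint.h>

/* Lowest VA bit indexed at each level (4 KB granule, 48-bit VA). */
static const int level_shift[4] = { 39, 30, 21, 12 };

/* Bytes of address space covered by one entry at the given level. */
static uint64_t level_span(int level) {
    return 1ULL << level_shift[level];
}

/* Start of the region (block or page) that contains va at this level. */
static uint64_t region_base(uint64_t va, int level) {
    return va & ~(level_span(level) - 1);
}

/* Offset of va inside that region. */
static uint64_t region_offset(uint64_t va, int level) {
    return va & (level_span(level) - 1);
}
```

An L1 entry covers 1 GB, an L2 entry 2 MB, and an L3 entry 4 KB. For address 0x40212345 mapped by an L2 block, the block starts at 0x40200000 and the offset within it is 0x12345.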
17.5 Demand Paging and Page Faults
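When a program accesses a virtual address that has no valid page table entry, or whose entry forbids the access, the MMU triggers a page fault. On ARM64 this arrives as a synchronous exception: a data abort for loads and stores, or an instruction abort for instruction fetches. The kernel's fault handler can: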
- Allocate a page: if the address is within the program's valid memory region but no physical page is mapped yet
- Load from disk: if the page has been swapped out
- Copy-on-write: if the page is shared and being written to
- Signal SIGSEGV: if the address is invalid (access violation)
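The four outcomes above can be sketched as a pure decision function. Everything here is hypothetical: `struct fault_info`, its fields, and `classify_fault` are invented for illustration, and a real kernel would derive this information from its region list and page tables rather than receive it pre-digested.

```c
#include <stdbool.h>

enum fault_action {
    FAULT_ALLOCATE,        /* map a fresh page */
    FAULT_SWAP_IN,         /* read the page back from disk */
    FAULT_COPY_ON_WRITE,   /* give the writer a private copy */
    FAULT_SIGSEGV          /* access violation */
};

/* Hypothetical summary of what the kernel knows about the faulting access. */
struct fault_info {
    bool in_region;        /* address lies inside a valid memory region */
    bool region_writable;  /* region permits writes */
    bool is_write;         /* faulting access was a write */
    bool page_present;     /* a valid entry already maps the page */
    bool swapped_out;      /* page contents currently live on disk */
    bool shared_cow;       /* page is shared and write-protected for COW */
};

static enum fault_action classify_fault(const struct fault_info *f) {
    if (!f->in_region)
        return FAULT_SIGSEGV;
    if (f->is_write && !f->region_writable)
        return FAULT_SIGSEGV;
    if (f->page_present) {
        if (f->is_write && f->shared_cow)
            return FAULT_COPY_ON_WRITE;
        /* Mapped, yet the access faulted: treated here as a violation. */
        return FAULT_SIGSEGV;
    }
    if (f->swapped_out)
        return FAULT_SWAP_IN;
    return FAULT_ALLOCATE;
}
```

Keeping the decision separate from the actions makes the policy easy to unit test on a host machine, without an MMU.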
/* Page fault handler (called from data abort exception handler) */
void handle_page_fault(uint64_t fault_addr, uint64_t esr) {
    /* fault_addr comes from FAR_EL1 (Fault Address Register) */
    /* esr comes from ESR_EL1 and tells us the type of fault */
    uint64_t ec = (esr >> 26) & 0x3F;   /* exception class */
    int is_data_abort = (ec == 0x24 || ec == 0x25);
    int write_fault = (esr >> 6) & 1;   /* for data aborts, bit 6 (WnR) = write */

    /* Find the process's memory region that contains this address */
    struct vm_region *region = find_vma(current_process, fault_addr);
    if (!region) {
        /* Segmentation fault: no region mapped */
        process_exit(current_process, SIGSEGV);
        return;
    }
    /* A write to a read-only region is also an access violation */
    if (is_data_abort && write_fault && !region->writable) {
        process_exit(current_process, SIGSEGV);
        return;
    }

    /* Allocate a physical page and map it. Only demand allocation is
       handled here; swap-in and copy-on-write are not. A production
       kernel would also zero the page so no stale data leaks. */
    void *page = page_alloc();
    if (!page) {
        process_exit(current_process, SIGKILL);
        return;
    }

    /* Map the page at the fault address */
    uint64_t va_aligned = fault_addr & ~0xFFFULL;
    map_page(current_process->page_table, va_aligned, (uint64_t)page,
             region->writable, region->user);
}
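The handler relies on a few fields of the syndrome register. As a rough sketch, the helpers below pull out the pieces most useful for paging; the function and enum names are made up for this example, while the field positions (EC in bits 31:26, WnR in bit 6, fault status code in bits 5:0) follow the Armv8-A definition of ESR_EL1 for aborts.

```c
#include <stdint.h>
#include <stdbool.h>

enum fault_kind { FK_TRANSLATION, FK_ACCESS_FLAG, FK_PERMISSION, FK_OTHER };

/* Exception class, bits 31:26. */
static unsigned esr_ec(uint64_t esr) {
    return (unsigned)((esr >> 26) & 0x3F);
}

/* 0x24 = data abort from a lower EL, 0x25 = data abort from the current EL. */
static bool esr_is_data_abort(uint64_t esr) {
    unsigned ec = esr_ec(esr);
    return ec == 0x24 || ec == 0x25;
}

/* WnR (bit 6) is only meaningful for data aborts. */
static bool esr_is_write(uint64_t esr) {
    return esr_is_data_abort(esr) && ((esr >> 6) & 1);
}

/* Fault status code, bits 5:0: the upper four bits give the kind. */
static enum fault_kind esr_fault_kind(uint64_t esr) {
    switch ((esr & 0x3F) >> 2) {
    case 1:  return FK_TRANSLATION;   /* 0b0001LL: no valid entry */
    case 2:  return FK_ACCESS_FLAG;   /* 0b0010LL: AF bit clear */
    case 3:  return FK_PERMISSION;    /* 0b0011LL: AP bits forbid access */
    default: return FK_OTHER;
    }
}

/* For the three kinds above, the low two bits give the table level. */
static unsigned esr_fault_level(uint64_t esr) {
    return (unsigned)(esr & 3);
}
```

Distinguishing a translation fault from a permission fault is what lets a handler tell demand allocation (no entry yet) apart from copy-on-write (entry present but write-protected).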
17.6 Our Implementation
In our kernel, paging is the foundation of the memory management subsystem. The components work together:
- Physical page allocator (Chapter 15): provides free physical pages
- Page table manager: creates and modifies page tables
- Memory mapper: maps virtual addresses to physical pages
- Page fault handler: responds to faults by allocating pages on demand
- Per-process address space: each process has its own page table
The map_page function creates or updates a page table entry at the correct level, allocating intermediate tables as needed:
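/* Returns 0 on success, -1 if an intermediate table could not be allocated */
int map_page(uint64_t *ttbr0, uint64_t va, uint64_t pa, int writable, int user) {
    uint64_t pte = make_pte(pa, writable, user);
    int levels[] = { 39, 30, 21, 12 };
    uint64_t *table = ttbr0;
    for (int i = 0; i < 4; i++) {
        int idx = (va >> levels[i]) & 0x1FF;
        if (i == 3) {
            table[idx] = pte; /* final level: set page entry */
            /* If this replaced a live entry, the TLB entry for va must be invalidated */
            return 0;
        }
        if (!(table[idx] & PTE_VALID)) {
            /* Allocate a new page table and clear it so every entry starts invalid */
            uint64_t *new_table = page_alloc();
            if (!new_table) return -1;
            for (int j = 0; j < 512; j++) new_table[j] = 0;
            table[idx] = ((uint64_t)new_table & PTE_ADDR) | PTE_VALID | PTE_TABLE;
        }
        /* Assumes tables are reachable at their physical address (identity mapping) */
        table = (uint64_t *)(table[idx] & PTE_ADDR);
    }
    return -1; /* not reached */
}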
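Mapping code like this can be exercised on a development machine before it ever runs on hardware. The following is a host-side sketch under loud assumptions: `host_page_alloc`, `host_map_page`, and `host_lookup` are stand-ins written for this example (not the kernel's functions), host pointers play the role of physical addresses and are assumed to fit in 48 bits, and block entries are never created, so the lookup does not handle them.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PTE_VALID (1ULL << 0)
#define PTE_TABLE (1ULL << 1)   /* also the page bit at L3 */
#define PTE_AF    (1ULL << 10)
#define PTE_ADDR  0x0000FFFFFFFFF000ULL

static const int shifts[4] = { 39, 30, 21, 12 };

/* Host stand-in for page_alloc(): one zeroed, 4 KB-aligned page. */
static void *host_page_alloc(void) {
    void *p = aligned_alloc(4096, 4096);
    if (p) memset(p, 0, 4096);
    return p;
}

/* Map one 4 KB page, creating intermediate tables on the way down. */
static int host_map_page(uint64_t *root, uint64_t va, uint64_t pa) {
    uint64_t *table = root;
    for (int i = 0; i < 3; i++) {
        int idx = (va >> shifts[i]) & 0x1FF;
        if (!(table[idx] & PTE_VALID)) {
            uint64_t *next = host_page_alloc();
            if (!next) return -1;
            table[idx] = ((uint64_t)(uintptr_t)next & PTE_ADDR) | PTE_VALID | PTE_TABLE;
        }
        table = (uint64_t *)(uintptr_t)(table[idx] & PTE_ADDR);
    }
    table[(va >> 12) & 0x1FF] = (pa & PTE_ADDR) | PTE_VALID | PTE_TABLE | PTE_AF;
    return 0;
}

/* Walk the tables; returns 0 and stores the result, or -1 if unmapped. */
static int host_lookup(uint64_t *root, uint64_t va, uint64_t *pa_out) {
    uint64_t *table = root;
    for (int i = 0; i < 4; i++) {
        uint64_t pte = table[(va >> shifts[i]) & 0x1FF];
        if (!(pte & PTE_VALID)) return -1;
        if (i == 3) {
            *pa_out = (pte & PTE_ADDR) | (va & 0xFFF);
            return 0;
        }
        table = (uint64_t *)(uintptr_t)(pte & PTE_ADDR);
    }
    return -1;
}
```

A typical check allocates a root table with `host_page_alloc()`, maps 0x40001000 to 0x80005000, and confirms that looking up 0x40001234 returns 0x80005234 while the neighbouring page at 0x40002000 stays unmapped. The sketch never frees its tables, which is acceptable for a short test but not for kernel code.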
17.7 Exercises
Exercise 1: VA Decomposition
Write a function that prints the L0, L1, L2, L3 indices and page offset for a given virtual address. Test with addresses 0x40000000 and 0xFFFF000040000000.
Exercise 2: Page Table Walker
Implement a function that walks the page tables for a given VA and prints the PTE at each level.
17.8 Summary
Paging divides virtual memory into fixed-size pages that are mapped to physical frames through multi-level page tables. On ARM64 with 4 KB pages, the translation uses 4 levels indexed by 9-bit portions of the virtual address. Page faults occur when a translation is missing or forbids the attempted access; the handler can allocate pages on demand, implement copy-on-write, or terminate the process. Paging is the mechanism that makes virtual memory work, providing isolation, flexibility, and efficient use of physical memory.