Chapter 21: Kernel Heap
- Why the kernel needs a heap allocator in addition to a page allocator
- The difference between page-level and slab-level allocation
- How to implement a simple slab allocator
- How to implement kmalloc/kfree used by kernel subsystems
- Memory pooling and cache-aware allocation
- Our kernel heap implementation
21.1 Why a Kernel Heap?
The page allocator (Chapter 15) allocates whole pages (4 KB). Many kernel objects are much smaller: a PCB (Chapter 13) might be a few hundred bytes, a file descriptor maybe 16 bytes, a mutex (Chapter 26) 8 bytes. Allocating a full 4 KB page for each tiny object would waste huge amounts of memory.
A kernel heap allocator provides fine-grained allocation (like malloc/free in userspace) within the kernel. It acquires pages from the page allocator, subdivides them into smaller objects, and tracks which objects are free.
21.2 Slab Allocator
A slab allocator is the most common kernel heap strategy (used by Linux, FreeBSD). It works by:
- Creating caches for each object type (e.g., a cache for PCBs, a cache for files, etc.)
- Each cache allocates slabs (groups of pages from the page allocator)
- Each slab is divided into equal-sized objects (one per allocation)
- Free objects are tracked in a linked list within the slab
/* Slab cache structure */
struct slab_cache {
const char *name;
size_t obj_size; /* Size of each object */
size_t obj_per_slab; /* Objects per slab page */
struct slab *partial; /* Slabs with some free objects */
struct slab *full; /* Slabs with no free objects */
struct slab *empty; /* Slabs with all objects free (can be freed) */
};
struct slab {
struct slab *next;
struct slab_cache *cache;
void *free_list; /* Linked list of free objects */
int inuse; /* Number of allocated objects */
int total; /* Total objects in this slab */
};
/* Initialize a slab cache */
void slab_cache_init(struct slab_cache *cache, const char *name, size_t obj_size) {
cache->name = name;
cache->obj_size = (obj_size + 7) & ~7; /* Align to 8 bytes */
cache->partial = NULL;
cache->full = NULL;
cache->empty = NULL;
/* Calculate objects per slab: we need space for slab header + objects */
size_t slab_header = sizeof(struct slab);
size_t usable = PAGE_SIZE - slab_header;
cache->obj_per_slab = usable / cache->obj_size;
}
/* Allocate from a slab cache */
void *slab_alloc(struct slab_cache *cache) {
struct slab *slab = cache->partial;
if (!slab) {
/* Need a new slab */
slab = page_alloc();
slab->cache = cache;
slab->inuse = 0;
slab->total = cache->obj_per_slab;
/* Build free list inside the slab data area */
char *data = (char *)slab + sizeof(struct slab);
slab->free_list = data;
for (int i = 0; i < cache->obj_per_slab - 1; i++) {
void **ptr = (void **)(data + i * cache->obj_size);
*ptr = data + (i + 1) * cache->obj_size;
}
void **last = (void **)(data + (cache->obj_per_slab - 1) * cache->obj_size);
*last = NULL;
slab->next = cache->partial;
cache->partial = slab;
}
/* Pop from free list */
void *obj = slab->free_list;
slab->free_list = *(void **)obj;
slab->inuse++;
if (slab->inuse == slab->total) {
/* Move slab from partial to full */
/* (remove from partial list, add to full list) */
}
return obj;
}
21.3 kmalloc/kfree
Our kernel provides generic kmalloc/kfree functions. kmalloc uses a set of pre-allocated slab caches with sizes that are powers of 2 (16, 32, 64, 128, 256, 512, 1024, 2048 bytes).
/* Predefined kmalloc caches (powers of 2) */
#define KMALLOC_CACHE_COUNT 8
static struct slab_cache kmalloc_caches[KMALLOC_CACHE_COUNT];
static size_t kmalloc_sizes[] = { 16, 32, 64, 128, 256, 512, 1024, 2048 };
/* Initialize kmalloc caches */
void kmalloc_init(void) {
for (int i = 0; i < KMALLOC_CACHE_COUNT; i++) {
char name[16];
snprintf(name, sizeof(name), "kmalloc-%zu", kmalloc_sizes[i]);
slab_cache_init(&kmalloc_caches[i], name, kmalloc_sizes[i]);
}
}
/* Allocate memory in kernel space */
void *kmalloc(size_t size) {
for (int i = 0; i < KMALLOC_CACHE_COUNT; i++) {
if (size <= kmalloc_sizes[i]) {
return slab_alloc(&kmalloc_caches[i]);
}
}
/* For allocations > 2 KB, fall back to page allocator */
if (size <= PAGE_SIZE) return page_alloc();
return page_alloc_multiple((size + PAGE_SIZE - 1) / PAGE_SIZE);
}
/* Free memory in kernel space */
void kfree(void *ptr, size_t size) {
if (!ptr) return;
/* Determine which cache this came from and free to it */
for (int i = 0; i < KMALLOC_CACHE_COUNT; i++) {
if (size <= kmalloc_sizes[i]) {
slab_free(&kmalloc_caches[i], ptr);
return;
}
}
/* Large allocations: return to page allocator */
page_free(ptr);
}
21.4 Memory Pools
A memory pool is a specialized allocator for objects of a single fixed size. Pools are faster than general-purpose slab allocators because they don't need to search for the right cache. Our kernel uses pools for frequently-allocated structures:
/* Pre-allocated pools for common kernel objects */
static struct slab_cache *pcb_cache;
static struct slab_cache *file_cache;
static struct slab_cache *msg_cache;
static struct slab_cache *vm_region_cache;
void init_kernel_pools(void) {
pcb_cache = kmalloc_caches + 1; /* PCB is ~128 bytes */
file_cache = kmalloc_caches + 0; /* file struct is ~32 bytes */
msg_cache = kmalloc_caches + 1;
vm_region_cache = kmalloc_caches + 2; /* ~64 bytes */
}
struct pcb *alloc_pcb(void) {
return slab_alloc(pcb_cache);
}
void free_pcb(struct pcb *pcb) {
slab_free(pcb_cache, pcb);
}
21.5 Heap Debugging
Heap bugs (use-after-free, double free, memory leaks) are the most common kernel bugs. Our implementation includes debugging features:
- Red zones: add a guard pattern around each allocation, checked on free
- Poison values: fill freed memory with 0xDEADBEEF to catch use-after-free
- Magic numbers: embed a signature in the slab header to detect corruption
- Leak tracking: count allocations and frees, report unfreed objects on shutdown
21.6 Our Implementation
Our kernel heap is initialized in kernel_heap_init(), called after the page allocator and MMU are ready. The heap:
- Creates 8 kmalloc caches for sizes 16-2048 bytes
- Creates specialized pools for PCB, files, VM regions, etc.
- Allocates slabs from the page allocator as needed
- Supports kmalloc/kfree for variable-size requests
- Returns large allocations (>2 KB) directly from the page allocator
- Includes red-zone and poison-value debugging in development builds
The heap uses spinlocks (Chapter 27) for SMP safety. Each slab cache has its own lock to reduce contention. The page allocator is also locked, so slab allocation involves two lock acquisitions: one for the slab cache, one for the page allocator (when a new slab is needed).
21.7 Exercises
Exercise 1: Slab Layout
For a slab cache with 48-byte objects, how many fit in a single slab (4 KB page)? Show the calculation including the slab header.
Exercise 2: Memory Leak Detector
Extend the slab allocator to track all live allocations. Add a function kmalloc_report() that prints the number of bytes allocated per cache.
21.8 Summary
The kernel heap provides fine-grained memory allocation below the page level. A slab allocator divides pages into equal-sized objects organized in caches. Our kmalloc/kfree implement a general-purpose interface backed by power-of-2 slab caches. Specialized memory pools offer fast allocation for frequently-used structures. Debugging features like red zones and poison values help catch common heap bugs during development.