ARM64 OS Handbook
🔍

Chapter 23: Processes

What You Will Learn in This Chapter
  • What a process is and how it differs from a program
  • Process creation via fork/exec
  • Process termination and reaping
  • Process hierarchy (parent-child relationships)
  • Copy-on-write optimization for fork
  • Our implementation of process management

23.1 What is a Process?

A process is an instance of a running program. It consists of:

  • An address space (page tables, mappings for code, data, stack, heap)
  • At least one thread of execution (register state, stack)
  • OS resources: file descriptors, signal handlers, working directory
  • A process ID (PID) that uniquely identifies it
  • A parent process that created it

A program is a static file on disk (an ELF binary, Chapter 37). A process is the dynamic execution of that program in memory.

23.2 Process Control Block

The kernel represents each process with a Process Control Block (PCB) (introduced in Chapter 13):

/* Process Control Block (PCB) */
struct pcb {
    /* Identification */
    pid_t pid;
    pid_t parent_pid;
    char name[32];

    /* Execution state */
    enum process_state state;
    struct context context;      /* Saved registers (callee-saved) */
    uint64_t *page_table;        /* TTBR0 value for this process */
    uint64_t stack_top;          /* Top of kernel stack */
    uint64_t stack_bottom;       /* Bottom of kernel stack */

    /* Scheduling */
    int priority;
    int ticks_remaining;
    struct pcb *next_ready;      /* Ready queue linked list */

    /* Resources */
    struct file_descriptor *fds; /* File descriptor table */
    uint16_t asid;               /* Address Space ID */

    /* Process tree */
    struct pcb *parent;
    struct pcb *children;
    struct pcb *sibling;
    int exit_code;
};

23.3 Fork: Process Creation

fork() creates a new process by cloning the calling process. The child process is an almost exact copy of the parent: same code, same data, same open files. The only difference is the return value of fork(): 0 in the child, the child's PID in the parent.

/* Fork implementation (simplified) */
int sys_fork(void) {
    struct pcb *parent = get_current_process();

    /* Allocate a new PCB */
    struct pcb *child = alloc_pcb();
    child->pid = allocate_pid();
    child->parent_pid = parent->pid;
    child->state = PROCESS_READY;
    child->priority = parent->priority;
    child->asid = allocate_asid();

    /* Clone kernel stack and copy context */
    child->stack_bottom = (uint64_t)kmalloc(KERNEL_STACK_SIZE);
    child->stack_top = child->stack_bottom + KERNEL_STACK_SIZE;
    memcpy((void *)child->stack_bottom, (void *)parent->stack_bottom, KERNEL_STACK_SIZE);

    /* Adjust the context stack pointer to the new stack */
    uint64_t offset = child->stack_bottom - parent->stack_bottom;
    child->context.sp += offset;

    /* Clone page tables (copy-on-write) */
    child->page_table = clone_page_table_cow(parent->page_table);

    /* Clone file descriptors */
    child->fds = clone_fd_table(parent->fds);

    /* Add to process tree */
    child->parent = parent;
    child->sibling = parent->children;
    parent->children = child;

    /* Make child runnable */
    scheduler_enqueue(child);

    /* Return child PID to parent, 0 to child */
    return child->pid;
}

23.4 Copy-on-Write Fork

A naive fork copies all physical pages. This is expensive, especially when the child immediately calls exec(). Copy-on-Write (COW) optimizes fork by sharing pages between parent and child until one of them writes to a page. When a write occurs, the page fault handler (Chapter 17) allocates a new page and copies the data.

/* Clone page tables for COW */
uint64_t *clone_page_table_cow(uint64_t *src) {
    uint64_t *new_ttbr0 = alloc_page_table();

    for (int i = 0; i < 512; i++) {
        uint64_t pte = src[i];
        if (!(pte & PTE_VALID)) continue;

        if (pte & PTE_TABLE) {
            /* Recursively clone sub-table */
            uint64_t *sub_src = (uint64_t *)(pte & PTE_ADDR);
            uint64_t *sub_dst = clone_page_table_cow(sub_src);
            new_ttbr0[i] = ((uint64_t)sub_dst & PTE_ADDR) | PTE_VALID | PTE_TABLE;
        } else {
            /* Make page read-only in both parent and child */
            pte &= ~PTE_AP_RW;  /* Remove write permission */
            src[i] = pte;       /* Update parent PTE too! */
            new_ttbr0[i] = pte;
        }
    }
    return new_ttbr0;
}

23.5 Exec: Loading a New Program

exec() replaces the current process's address space with a new program loaded from an ELF file:

/* Exec system call (simplified) */
int sys_exec(const char *path) {
    struct pcb *proc = get_current_process();

    /* Free old address space */
    free_page_table(proc->page_table, 0);
    proc->page_table = alloc_page_table();

    /* Load ELF binary into address space */
    struct elf_file *elf = load_elf(path);
    if (!elf) return -1;

    /* Map each segment */
    for (int i = 0; i < elf->phnum; i++) {
        struct elf_phdr *ph = &elf->phdrs[i];
        if (ph->type != PT_LOAD) continue;

        /* Map segment at specified virtual address */
        for (uint64_t off = 0; off < ph->memsz; off += PAGE_SIZE) {
            uint64_t va = ph->vaddr + off;
            void *page = page_alloc();
            memcpy(page, elf->data + ph->offset + off, min(PAGE_SIZE, ph->filesz - off));
            map_page(proc->page_table, va, (uint64_t)page, 1, 1);
        }
    }

    /* Set up user stack */
    uint64_t user_stack_base = USER_STACK_TOP - USER_STACK_SIZE;
    for (uint64_t va = user_stack_base; va < USER_STACK_TOP; va += PAGE_SIZE) {
        void *page = page_alloc();
        map_page(proc->page_table, va, (uint64_t)page, 1, 1);
    }

    /* Set entry point and user SP */
    proc->context.pc = elf->entry;
    proc->context.sp = USER_STACK_TOP;

    free_elf(elf);
    return 0;
}

23.6 Process Termination

When a process calls exit(), the kernel:

  1. Marks the process as ZOMBIE (retains PCB for parent to read exit code)
  2. Frees all resources (page tables except PCB, file descriptors)
  3. Schedules parent for wakeup (so it can call wait())
  4. Orphans are reparented to the init process (PID 1)
/* Exit system call */
void sys_exit(int exit_code) {
    struct pcb *proc = get_current_process();
    proc->exit_code = exit_code;
    proc->state = PROCESS_ZOMBIE;

    /* Free resources except PCB */
    free_page_table(proc->page_table, 0);
    free_fd_table(proc->fds);

    /* Notify parent */
    if (proc->parent && proc->parent->state == PROCESS_BLOCKED) {
        scheduler_enqueue(proc->parent);  /* wake parent */
    }

    /* Reparent orphans to init */
    struct pcb *child = proc->children;
    while (child) {
        child->parent = init_process;
        child = child->sibling;
    }

    scheduler_dequeue(proc);  /* Remove from ready queue */
    scheduler_pick_next();    /* Switch to next process */
}

23.7 Process Tree

Processes form a tree. The root is the init process (PID 1), created by the kernel at boot. All processes descend from init. The ps command displays this hierarchy:

PID  PPID  NAME
  0     0  idle
  1     0  init
  2     1  shell
  3     2  cat
  4     1  httpd

23.8 Our Implementation

Our kernel's process management subsystem (proc/) provides:

  • Process creation: fork with copy-on-write optimization
  • Program loading: exec that loads ELF files into the address space
  • Process termination: exit with zombie state and parent notification
  • Process reaping: wait() to collect exit codes from children
  • PID management: bitmap allocator for PIDs (reuses low numbers)
  • Process tree: parent-child relationships, orphan reparenting to init
  • Init process: PID 1, created at boot, never exits, reaps orphans

23.9 Exercises

Exercise 1: COW Fault Handler

Implement the page fault handler case that distinguishes COW faults from segmentation faults. On a COW fault, allocate a new page, copy the data, and map it writable.

Exercise 2: Zombie Cleanup

Write a test: create a child process that exits immediately. Verify that wait() returns the correct exit code and that the zombie is freed.

23.10 Summary

A process is an instance of a running program with its own address space, resources, and execution state. fork() creates a new process by cloning the parent, using copy-on-write to avoid copying physical pages unnecessarily. exec() replaces the address space with a new program loaded from an ELF file. exit() terminates a process, leaving a zombie until the parent calls wait(). Processes form a tree rooted at init (PID 1). Our kernel implements all these operations with proper resource tracking, orphan reparenting, and PID management.