Architecture Reference

This is a reference for the computer architecture concepts that directly affect our kernel code. It is not a general introduction to computer architecture. Focus on the sections that are relevant to the subsystem you are working on.

Memory Layout (QEMU virt)

Address           | Region               | Size       | Usage
------------------+----------------------+------------+-------------------
0x0000_0000       | ROM / flash          | 2 x 64 MB  | Firmware
0x0800_0000       | GIC (interrupts)     | 2 x 64 KB  | Interrupt controller (GICv2 distributor, CPU interface)
0x0900_0000       | UART (PL011)         | 4 KB       | Serial I/O
0x0901_0000       | RTC                  | 4 KB       | Real-time clock
0x0903_0000       | GPIO                 | 4 KB       | General purpose I/O
0x4000_0000       | RAM start            | 512+ MB    | Main memory
0x4000_0000       | Kernel load address  | variable   | Our kernel binary

Our kernel is loaded at 0x40000000 by QEMU. The UART is at 0x09000000. The GIC (Generic Interrupt Controller) is at 0x08000000.
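
As a rough illustration of how these addresses end up in driver code, here is a minimal sketch of a polled PL011 transmit routine. The base addresses come from the paragraph above; the register offsets (DR at 0x00, FR at 0x18) and the TXFF flag are the standard PL011 ones. The helper name pl011_putc and the macro names are hypothetical. The function takes the register base as a parameter so it can be exercised on a host machine against a plain array.

```c
#include <stdint.h>

/* Addresses taken from the memory map above. */
#define KERNEL_LOAD_ADDR 0x40000000UL
#define UART0_BASE       0x09000000UL
#define GIC_BASE         0x08000000UL

/* PL011 register offsets (in bytes) and the "transmit FIFO full" flag. */
#define PL011_DR      0x00u
#define PL011_FR      0x18u
#define PL011_FR_TXFF (1u << 5)

/* Hypothetical helper: send one byte through a PL011 whose registers
 * start at `base`. Registers are accessed as 32-bit words. */
static void pl011_putc(volatile uint32_t *base, char c)
{
    /* Spin until the transmit FIFO has room. */
    while (base[PL011_FR / 4] & PL011_FR_TXFF) {
    }
    base[PL011_DR / 4] = (uint32_t)(unsigned char)c;
}
```

In the kernel, with the MMU still off, the call would look like pl011_putc((volatile uint32_t *)UART0_BASE, 'A').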

Booting a CPU

When the CPU powers on:

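  1. Firmware runs at EL3 and initializes hardware
  2. Firmware drops to EL2 and loads our kernel at 0x40000000
  3. The CPU starts executing at the kernel entry point with the MMU and caches off
  4. Our _start code sets up the stack, clears BSS, and calls kernel_main

Under QEMU with -kernel, QEMU's built-in loader stands in for the firmware. With the default virt machine options the kernel is entered at EL1; EL2 is only available when the machine is started with virtualization=on. _start should read CurrentEL rather than assume a level.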

The CPU starts in a simple state:

  • MMU disabled (all addresses are physical)
  • Data and instruction caches disabled
  • Stack pointer uninitialized (we must set it)
  • All interrupts masked (disabled)
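
The BSS-clearing part of step 4 can be sketched in C. This is only an illustration: the stack pointer itself has to be set in assembly before any C code runs, and the function name clear_bss is hypothetical. In the kernel the two pointers would come from linker-script symbols (names such as __bss_start and __bss_end are assumptions, not taken from our linker script).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Zero the half-open range [start, end).
 * In the kernel, start and end would be linker-script symbols marking
 * the BSS section; here they are ordinary pointers so the routine can
 * be tried on a host machine.
 */
static void clear_bss(uint8_t *start, uint8_t *end)
{
    /* The volatile pointer keeps the compiler from replacing the loop
     * with a call to memset, which may not exist this early in boot. */
    for (volatile uint8_t *p = start; p < end; p++) {
        *p = 0;
    }
}
```

Because the range is half-open, an empty BSS (start == end) is handled without a special case.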

Memory Hierarchy

Only the layers relevant to kernel development:

Level        | Our code touches it when...
-------------+------------------------------------------------------------
Registers    | Every instruction. Context switching must save/restore them.
L1/L2 Cache  | Page table modifications require cache maintenance (DC, IC instructions).
RAM          | All kernel data lives here. Our memory manager allocates it.
Disk         | Our filesystem driver reads/writes it (future).

Cache Maintenance in Kernel Code

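When we modify page tables (which live in memory that may be cached), two things can be stale: the write may not yet be visible to the MMU's table walker, and the TLB may still hold the old translation. We must synchronize explicitly with barriers and a TLB invalidate:

/* After writing a new page table entry */
dsb sy              /* ensure the page table write has completed */
tlbi vmalle1        /* invalidate all EL1 TLB entries on this core */
dsb sy              /* ensure the TLB invalidate has completed */
isb                 /* flush the pipeline so later instructions use the new mapping */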

Why Caches Matter

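  • Context switching: The incoming process's working set displaces the previous one's cache lines, so each process resumes with cold caches. Frequent switching = poor cache performance.
  • Page table walks: On a TLB miss the MMU reads page tables from memory. If the tables are cached, the walk is faster.
  • Device memory: Hardware registers must not be cached. We map device memory as Device-nGnRnE, a memory type defined in MAIR_EL1 and selected by the AttrIndx field of each page table entry.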
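
To make the device-memory point concrete, the sketch below builds a MAIR_EL1 value and a block descriptor that maps a region as Device-nGnRnE. The bit positions and the attribute encodings (0x00 for Device-nGnRnE, 0xFF for Normal write-back) are architectural. Everything else is an assumption for illustration: the slot assignment (index 0 for device, index 1 for normal), the 4 KB granule with 2 MB level 2 blocks, and the helper names.

```c
#include <stdint.h>

/* MAIR_EL1 attribute encodings (architectural values). */
#define MAIR_DEVICE_nGnRnE 0x00ull
#define MAIR_NORMAL_WB     0xFFull   /* Normal, inner/outer write-back */

/* Which MAIR slot holds which type: our assumption, not fixed by hardware. */
#define ATTR_IDX_DEVICE 0u
#define ATTR_IDX_NORMAL 1u

/* Stage 1 block descriptor fields. */
#define DESC_VALID   (1ull << 0)            /* bits[1:0] = 0b01 means "block" */
#define DESC_ATTR(i) ((uint64_t)(i) << 2)   /* AttrIndx, bits[4:2] */
#define DESC_AF      (1ull << 10)           /* access flag */
#define DESC_PXN     (1ull << 53)           /* never executable at EL1 */
#define DESC_UXN     (1ull << 54)           /* never executable at EL0 */

/* Value to program into MAIR_EL1 for the slot assignment above. */
static uint64_t mair_value(void)
{
    return (MAIR_DEVICE_nGnRnE << (8 * ATTR_IDX_DEVICE)) |
           (MAIR_NORMAL_WB     << (8 * ATTR_IDX_NORMAL));
}

/* Build a 2 MB block descriptor for device registers at physical address pa. */
static uint64_t device_block_desc(uint64_t pa)
{
    uint64_t aligned = pa & ~((1ull << 21) - 1);   /* round down to 2 MB */
    return aligned | DESC_VALID | DESC_ATTR(ATTR_IDX_DEVICE) |
           DESC_AF | DESC_PXN | DESC_UXN;
}
```

For example, device_block_desc(0x09000000) would describe the 2 MB block containing the UART, marked non-executable and uncached.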

RISC vs CISC: Why It Matters to Us

ARM64 is RISC. The practical consequences for our kernel:

  • Fixed instruction size: All instructions are 4 bytes. This simplifies the decoder but means complex operations take more instructions; for example, an arbitrary 64-bit constant does not fit in one instruction and is built up in 16-bit pieces.
  • Load/store architecture: Only load and store instructions (LDR, STR, and variants such as LDP/STP) access memory. Everything else works on registers. Our assembly code will clearly separate memory accesses from computation.
  • Many registers: 31 general-purpose registers (x0-x30) means fewer spills to the stack. Good for performance.
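
One way to see the cost of fixed-size instructions: MOVZ and MOVK each carry a 16-bit immediate, so loading an address into a register can take up to four instructions. The sketch below counts the instructions in a naive MOVZ + MOVK sequence. It is a simplification under stated assumptions; the function name is hypothetical, and a real assembler can sometimes do better (for example with MOVN or a logical immediate).

```c
#include <stdint.h>

/* Count the instructions in a naive MOVZ + MOVK sequence that builds
 * `value` in a register, one instruction per non-zero 16-bit chunk. */
static int movz_movk_count(uint64_t value)
{
    int count = 0;
    for (int shift = 0; shift < 64; shift += 16) {
        if ((value >> shift) & 0xFFFF) {
            count++;
        }
    }
    return count ? count : 1;   /* zero still needs a single MOVZ */
}
```

Addresses such as 0x40000000 and 0x09000000 have a single non-zero 16-bit chunk and need only one instruction under this scheme, while a constant with all four chunks populated needs four.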

Endianness

ARM64 is little-endian by default. When a 4-byte value is stored in memory, its least significant byte goes at the lowest address:

Value stored: 0x12345678

Memory bytes (low to high address):
  0x78  0x56  0x34  0x12

This matters when:

  • Reading disk superblocks or file system metadata
  • Interpreting network packet headers (network byte order is big-endian)
  • Writing device drivers with specific register layouts
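
For the cases above, reading multi-byte fields byte by byte avoids depending on the CPU's byte order and on alignment. A minimal sketch, with hypothetical helper names:

```c
#include <stdint.h>

/* Read a 32-bit little-endian value from a byte buffer. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read a 32-bit big-endian (network byte order) value from a byte buffer. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Given the bytes 0x78 0x56 0x34 0x12 from the example, load_le32 yields 0x12345678, while load_be32 interprets the same bytes as 0x78563412.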

Key Register-Level Details for Kernel Code

Register  | Role                    | Kernel usage
----------+-------------------------+------------------------------------------------------------
PC        | Program Counter         | Saved during context switch to resume the right instruction
SP_EL0    | User stack pointer      | Saved/restored when switching between user and kernel mode
SP_EL1    | Kernel stack pointer    | The kernel's own stack; separate from user stack
PSTATE    | Processor State         | Contains DAIF flags (interrupt masks), EL, etc.
ELR_EL1   | Exception Link Register | Holds the return address when an exception is taken to EL1
SPSR_EL1  | Saved PSTATE            | Holds the PSTATE value from the moment the exception was taken

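When an exception or interrupt is taken to EL1, the CPU automatically saves the return address in ELR_EL1 and the processor state in SPSR_EL1. It does not save the general-purpose registers, and a nested exception would overwrite both system registers. Our exception handler must therefore save ELR_EL1 and SPSR_EL1 (along with the general-purpose registers) before handling the exception, then restore them before returning with ERET.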
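
A possible shape for the saved state is sketched below. The struct layout and names are hypothetical, not our actual handler's; the SPSR_EL1 bit positions (source exception level in bits [3:2], DAIF masks in bits [9:6]) are architectural for exceptions taken from AArch64 state.

```c
#include <stdint.h>

/* Hypothetical layout of what an EL1 exception handler saves on entry. */
struct trap_frame {
    uint64_t x[31];     /* x0..x30; the CPU does not save these for us */
    uint64_t sp_el0;    /* user stack pointer */
    uint64_t elr_el1;   /* return address captured by the CPU */
    uint64_t spsr_el1;  /* PSTATE captured by the CPU */
};

/* SPSR_EL1 fields. */
#define SPSR_EL_SHIFT 2             /* M[3:2]: exception level we came from */
#define SPSR_F_BIT    (1ull << 6)   /* FIQ masked */
#define SPSR_I_BIT    (1ull << 7)   /* IRQ masked */
#define SPSR_A_BIT    (1ull << 8)   /* SError masked */
#define SPSR_D_BIT    (1ull << 9)   /* debug exceptions masked */

/* Exception level the CPU was running at when the exception was taken. */
static unsigned trap_source_el(const struct trap_frame *tf)
{
    return (unsigned)((tf->spsr_el1 >> SPSR_EL_SHIFT) & 0x3);
}

/* Non-zero if IRQs were masked in the interrupted context. */
static int trap_irqs_were_masked(const struct trap_frame *tf)
{
    return (tf->spsr_el1 & SPSR_I_BIT) != 0;
}
```

The frame is 34 eight-byte slots (272 bytes), a multiple of 16, which keeps the stack pointer 16-byte aligned when the frame is pushed on the kernel stack.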

Further Reading

  • ARM Cortex-A Series Programmer's Guide for ARMv8-A (free from ARM)
  • QEMU source: hw/arm/virt.c for the complete memory map
  • ARM Architecture Reference Manual ARMv8-A, sections D1 and D2 (system registers)