Architecture Reference
This is a reference for the computer architecture concepts that directly affect our kernel code. It is not a general introduction to computer architecture. Focus on the sections that are relevant to the subsystem you are working on.
Memory Layout (QEMU virt)
Address           | Region               | Size       | Usage
------------------+----------------------+------------+---------------------
0x0000_0000       | ROM / flash          | 64 MB      | Firmware
0x0800_0000       | GIC (interrupts)     | 8 KB       | Interrupt controller
0x0900_0000       | UART (PL011)         | 4 KB       | Serial I/O
0x0901_0000       | RTC                  | 4 KB       | Real-time clock
0x0903_0000       | GPIO                 | 4 KB       | General purpose I/O
0x4000_0000       | RAM start            | 512+ MB    | Main memory
0x4000_0000       | Kernel load address  | variable   | Our kernel binary
Our kernel is loaded at 0x40000000 by QEMU. The UART is at 0x09000000.
The GIC (Generic Interrupt Controller) is at 0x08000000.
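Fixed addresses like these usually end up as constants in a header. The following is a minimal sketch in C, not our actual driver: the macro and function names are hypothetical, and the PL011 register offsets (data register at 0x00, flag register at 0x18, TXFF in bit 5) are taken from the PL011 documentation rather than from this page. The function takes the register base as a parameter so the same logic can be tried against an ordinary array on a host machine.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses quoted in the text above (QEMU virt). Names are hypothetical. */
#define GIC_BASE         UINT64_C(0x08000000)
#define UART0_BASE       UINT64_C(0x09000000)
#define KERNEL_LOAD_ADDR UINT64_C(0x40000000)

/* Compile-time sanity check: the device window sits below the kernel. */
static_assert(GIC_BASE < UART0_BASE && UART0_BASE < KERNEL_LOAD_ADDR,
              "unexpected memory map ordering");

/* PL011 register offsets in bytes, and the "transmit FIFO full" flag. */
#define PL011_DR      0x00u
#define PL011_FR      0x18u
#define PL011_FR_TXFF (1u << 5)

/*
 * Write one character: spin while the transmit FIFO is full, then store
 * the byte to the data register. In the kernel, `uart` would be
 * (volatile uint32_t *)UART0_BASE; on a host it can be a plain array.
 */
void uart_putc(volatile uint32_t *uart, char c)
{
    while (uart[PL011_FR / 4] & PL011_FR_TXFF) {
        /* busy-wait until there is room in the FIFO */
    }
    uart[PL011_DR / 4] = (uint32_t)(unsigned char)c;
}
```

The volatile qualifier matters here: without it the compiler is free to hoist the flag read out of the loop or drop the store entirely.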
Booting a CPU
When the CPU powers on:
- Firmware runs at EL3, initializes hardware
- Firmware drops to EL2 and loads our kernel at 0x40000000
- CPU starts executing at the kernel entry point with MMU off, caches off
- Our _start code sets up the stack, clears BSS, and calls kernel_main
The CPU starts in a simple state:
- MMU disabled (all addresses are physical)
- Data and instruction caches disabled
- Stack pointer uninitialized (we must set it)
- All interrupts masked (disabled)
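The entry code has to set the stack pointer in assembly, since no C code can run before a stack exists. The BSS-clearing step, however, can be sketched in C. This is an illustration under assumptions, not our actual _start: the function name is hypothetical, and in a real kernel the two bounds would come from linker-script symbols (often named something like __bss_start and __bss_end, which are also assumptions here).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Zero every 64-bit word in [start, end). The bounds are assumed to be
 * 8-byte aligned, which a linker script can guarantee with ALIGN(8).
 * The volatile pointer discourages the compiler from replacing the loop
 * with a call to memset, which may not exist this early in boot.
 */
void clear_bss(uint64_t *start, uint64_t *end)
{
    for (volatile uint64_t *p = start; p < end; p++) {
        *p = 0;
    }
}
```

C guarantees that objects with static storage duration and no initializer start out as zero, so this step must finish before kernel_main touches any global.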
Memory Hierarchy
Only the layers relevant to kernel development:
| Level | Our code touches it when... |
|---|---|
| Registers | Every instruction. Context switching must save/restore them. |
| L1/L2 Cache | Page table modifications require cache maintenance (DC, IC instructions). |
| RAM | All kernel data lives here. Our memory manager allocates it. |
| Disk | Our filesystem driver reads/writes it (future). |
Cache Maintenance in Kernel Code
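When we modify page tables (which live in ordinary memory that may be cached), the MMU may still hold stale translations in its TLB. After writing an entry we must make the write visible to the table walker and invalidate the old translations explicitly: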
    /* After writing a new page table entry */
    dsb sy          /* ensure write completes */
    tlbi vmalle1    /* invalidate TLB for EL1 */
    dsb sy          /* ensure TLB invalidate completes */
    isb             /* flush instruction pipeline */
Why Caches Matter
- Context switching: Switching processes evicts cache contents. Frequent switching = poor cache performance.
- Page table walks: The MMU reads page tables from memory. If they are cached, translations are faster.
- Device memory: Hardware registers must not be cached. We mark device memory as Device-nGnRnE in page tables.
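To make the device-memory point concrete, here is a hedged sketch of how a page table entry selects Device-nGnRnE. The memory type is not stored in the entry itself: the entry carries a 3-bit index (AttrIndx, bits 4:2) into the MAIR_EL1 register, and MAIR_EL1 holds the actual encoding (0x00 for Device-nGnRnE, 0xFF for normal write-back memory). The sketch assumes a 4 KB granule with 48-bit addresses and a level-3 page descriptor. All macro and function names, and the choice of index 0 for device memory, are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural MAIR_EL1 attribute encodings. */
#define MAIR_ATTR_DEVICE_nGnRnE UINT64_C(0x00)
#define MAIR_ATTR_NORMAL_WB     UINT64_C(0xFF)

/* Which MAIR slot holds which attribute: our own (hypothetical) choice. */
#define ATTR_IDX_DEVICE 0u
#define ATTR_IDX_NORMAL 1u

/* Level-3 page descriptor bits (4 KB granule). */
#define PTE_VALID      (UINT64_C(1) << 0)
#define PTE_PAGE       (UINT64_C(1) << 1)   /* with VALID: a page entry */
#define PTE_ATTRIDX(i) ((uint64_t)(i) << 2) /* index into MAIR_EL1 */
#define PTE_AF         (UINT64_C(1) << 10)  /* access flag */
#define PTE_PXN        (UINT64_C(1) << 53)  /* no execute at EL1 */
#define PTE_UXN        (UINT64_C(1) << 54)  /* no execute at EL0 */
#define PTE_ADDR_MASK  UINT64_C(0x0000FFFFFFFFF000)

/* Value to program into MAIR_EL1 for the two slots defined above. */
uint64_t mair_el1_value(void)
{
    return (MAIR_ATTR_DEVICE_nGnRnE << (8 * ATTR_IDX_DEVICE)) |
           (MAIR_ATTR_NORMAL_WB     << (8 * ATTR_IDX_NORMAL));
}

/* Build an entry mapping one 4 KB page of device registers at `phys`. */
uint64_t device_page_pte(uint64_t phys)
{
    return (phys & PTE_ADDR_MASK) | PTE_UXN | PTE_PXN | PTE_AF |
           PTE_ATTRIDX(ATTR_IDX_DEVICE) | PTE_PAGE | PTE_VALID;
}
```

Marking device pages non-executable at both exception levels is a design choice in this sketch, not something the architecture requires.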
RISC vs CISC: Why It Matters to Us
ARM64 is RISC. The practical consequences for our kernel:
- Fixed instruction size: All instructions are 4 bytes. This simplifies the decoder but means more instructions to do complex operations.
- Load/store architecture: Only LDR/STR access memory. Everything else works on registers. Our assembly code will clearly separate memory accesses from computation.
- Many registers: 31 general-purpose registers means fewer spills to the stack. Good for performance.
Endianness
ARM64 is little-endian by default. When we read a 4-byte value from memory:
Value stored: 0x12345678
Memory bytes (low to high address):
0x78 0x56 0x34 0x12
This matters when:
- Reading disk superblocks or file system metadata
- Interpreting network packet headers (network byte order is big-endian)
- Writing device drivers with specific register layouts
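For the cases above, one common way to stay independent of the CPU's byte order is to assemble multi-byte values from individual bytes instead of casting a buffer pointer to a wider type. The helper names below are hypothetical and the code is a small sketch, not part of our kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 32-bit little-endian value: lowest address holds the low byte. */
uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0]         |
           ((uint32_t)p[1] << 8)  |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/* Read a 32-bit big-endian (network order) value: lowest address holds
 * the high byte. */
uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
           (uint32_t)p[3];
}
```

With the bytes 0x78 0x56 0x34 0x12 from the example above, load_le32 yields 0x12345678 and load_be32 yields 0x78563412. Reading byte by byte also sidesteps alignment problems when a field sits at an odd offset in a packet or on-disk structure.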
Key Register-Level Details for Kernel Code
| Register | Role | Kernel usage |
|---|---|---|
| PC | Program Counter | Saved during context switch to resume the right instruction |
| SP_EL0 | User stack pointer | Saved/restored when switching between user and kernel mode |
| SP_EL1 | Kernel stack pointer | The kernel's own stack; separate from user stack |
| PSTATE | Processor State | Contains DAIF flags (interrupt masks), EL, etc. |
| ELR_EL1 | Exception Link Register | Holds the return address when an exception is taken to EL1 |
| SPSR_EL1 | Saved PSTATE | Holds the PSTATE value from the moment the exception was taken |
When an exception or interrupt occurs, the CPU automatically saves the return address in ELR_EL1 and the processor state in SPSR_EL1. Our exception handler must save these before handling the exception, then restore them before returning.
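One possible shape for the saved state is a trap frame that the handler's assembly prologue fills in on the kernel stack. The layout and names below are hypothetical and only meant to illustrate the idea; the SPSR bit positions used by the helpers (exception level in bits 3:2, IRQ mask in bit 7) are architectural for AArch64 state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout of the state saved on exception entry. */
struct trap_frame {
    uint64_t x[31];     /* general-purpose registers x0..x30 */
    uint64_t sp_el0;    /* user stack pointer */
    uint64_t elr_el1;   /* return address saved by the CPU */
    uint64_t spsr_el1;  /* PSTATE saved by the CPU */
};

/* SP must stay 16-byte aligned, so the frame size should be a multiple
 * of 16 bytes. */
static_assert(sizeof(struct trap_frame) % 16 == 0,
              "trap frame must keep the stack 16-byte aligned");

/* Exception level the CPU was running at when the exception was taken. */
unsigned spsr_source_el(uint64_t spsr)
{
    return (unsigned)((spsr >> 2) & 0x3);
}

/* True if IRQs were masked (PSTATE.I set) in the interrupted context. */
bool spsr_irq_masked(uint64_t spsr)
{
    return ((spsr >> 7) & 0x1) != 0;
}
```

Copying ELR_EL1 and SPSR_EL1 into the frame is what makes it safe for a second exception to arrive while the first is still being handled, because the CPU overwrites both registers on every exception taken to EL1.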
Further Reading
- ARM Cortex-A Series Programmer's Guide for ARMv8-A (free from ARM)
- QEMU source: hw/arm/virt.c for the complete memory map
- ARM Architecture Reference Manual ARMv8-A, sections D1 and D2 (system registers)