Chapter 30: Filesystems
- What a filesystem is and what it does
- How files and directories are organized on disk
- The structure of a simple filesystem: superblock, inodes, data blocks
- File operations: open, read, write, close
- Directory structures and path resolution
- Our simple FS implementation
30.1 What is a Filesystem?
A filesystem is a data structure stored on a block device (disk, SD card, SSD) that organizes files and directories. It provides persistent storage: data that survives power loss. The filesystem manages:
- Files: named sequences of bytes
- Directories: containers that map names to files or subdirectories
- Metadata: file size, timestamps, permissions, ownership
- Free space: tracking which blocks are available
30.2 Simple FS Disk Layout
Our simple filesystem (sfs) uses this disk layout:
Block 0: Boot sector (reserved)
Block 1: Superblock (filesystem parameters)
Block 2-63: Inode bitmap (8192 inodes)
Block 64-127: Block bitmap (65536 blocks)
Block 128-...: Inode table (32 blocks, 64 inodes per block = 2048 inodes)
... Data blocks (rest of disk)
/* Superblock (on disk at block 1) */
struct sfs_superblock {
uint32_t magic; /* 0x53465300 ("SFS\0") */
uint32_t block_size; /* 512 bytes */
uint32_t num_blocks; /* Total blocks on device */
uint32_t inode_count; /* Total inodes */
uint32_t inode_bitmap_blk;/* Block where inode bitmap starts */
uint32_t block_bitmap_blk;/* Block where block bitmap starts */
uint32_t inode_table_blk; /* Block where inode table starts */
uint32_t first_data_blk; /* First data block */
};
/* Inode (on disk in inode table) */
struct sfs_inode {
uint16_t mode; /* File type & permissions */
uint16_t uid; /* Owner user ID */
uint32_t size; /* File size in bytes */
uint32_t ctime; /* Creation timestamp */
uint32_t mtime; /* Modification timestamp */
uint16_t gid; /* Owner group ID */
uint16_t links; /* Number of hard links */
uint32_t blocks[12]; /* Direct block pointers */
uint32_t indirect; /* Single indirect block pointer */
uint32_t double_indirect; /* Double indirect block pointer */
};
/* Directory entry (on disk in a directory's data blocks) */
struct sfs_dirent {
uint32_t inode; /* Inode number */
uint16_t rec_len; /* Record length */
uint8_t name_len; /* Name length */
uint8_t file_type; /* File type */
char name[255]; /* File name */
};
30.3 File Operations
/* Open a file by inode number */
struct file *sfs_open(struct sfs_fs *fs, uint32_t inum) {
struct sfs_inode *inode = sfs_read_inode(fs, inum);
if (!inode) return NULL;
struct file *f = kmalloc(sizeof(struct file));
f->inode = inum;
f->pos = 0;
f->size = inode->size;
f->mode = inode->mode;
return f;
}
/* Read from a file */
ssize_t sfs_read(struct sfs_fs *fs, struct file *f, void *buf, size_t count) {
ssize_t total = 0;
while (total < count && f->pos < f->size) {
/* Calculate block number and offset within block */
uint32_t block_idx = f->pos / fs->block_size;
uint32_t block_off = f->pos % fs->block_size;
uint32_t phys_block = sfs_bmap(fs, f->inode, block_idx);
/* Read the block */
uint8_t block_buf[fs->block_size];
sfs_read_block(fs, phys_block, block_buf);
/* Copy data to user */
size_t chunk = min(count - total, fs->block_size - block_off);
chunk = min(chunk, f->size - f->pos);
memcpy(buf + total, block_buf + block_off, chunk);
f->pos += chunk;
total += chunk;
}
return total;
}
/* Write to a file */
ssize_t sfs_write(struct sfs_fs *fs, struct file *f, const void *buf, size_t count) {
ssize_t total = 0;
while (total < count) {
uint32_t block_idx = f->pos / fs->block_size;
uint32_t block_off = f->pos % fs->block_size;
/* Allocate a block if needed */
uint32_t phys_block = sfs_bmap(fs, f->inode, block_idx);
if (!phys_block) {
phys_block = sfs_alloc_block(fs);
sfs_bmap_set(fs, f->inode, block_idx, phys_block);
}
uint8_t block_buf[fs->block_size];
sfs_read_block(fs, phys_block, block_buf);
size_t chunk = min(count - total, fs->block_size - block_off);
memcpy(block_buf + block_off, buf + total, chunk);
sfs_write_block(fs, phys_block, block_buf);
f->pos += chunk;
total += chunk;
}
if (f->pos > f->size) {
f->size = f->pos;
/* Update inode size on disk */
sfs_update_inode_size(fs, f->inode, f->size);
}
return total;
}
30.4 Directory Operations
/* Look up a name in a directory */
uint32_t sfs_dir_lookup(struct sfs_fs *fs, uint32_t dir_inum, const char *name) {
struct file *dir = sfs_open(fs, dir_inum);
struct sfs_dirent dent;
while (sfs_read(fs, dir, &dent, sizeof(dent)) == sizeof(dent)) {
if (strncmp(dent.name, name, dent.name_len) == 0 && strlen(name) == dent.name_len) {
sfs_close(dir);
return dent.inode;
}
}
sfs_close(dir);
return 0; /* Not found */
}
/* Add an entry to a directory */
int sfs_dir_add(struct sfs_fs *fs, uint32_t dir_inum, const char *name, uint32_t inum) {
struct sfs_dirent dent;
dent.inode = inum;
dent.name_len = strlen(name);
dent.rec_len = sizeof(dent);
dent.file_type = 0;
memcpy(dent.name, name, dent.name_len);
struct file *dir = sfs_open(fs, dir_inum);
sfs_write(fs, dir, &dent, sizeof(dent));
sfs_close(dir);
return 0;
}
30.5 Our Implementation
Our kernel implements a simple filesystem (sfs) for bootable disk images. Key features:
- Block size: 512 bytes (matches QEMU virtio block device)
- Inodes: 12 direct blocks + 1 single indirect + 1 double indirect (max file size ~8 MB)
- Bitmaps: separate bitmaps for inodes and data blocks
- Directory: linear scan of directory entries (O(n) lookup, simple for small dirs)
- Root directory: inode 2 (Unix convention, inode 1 is bad blocks)
- Root filesystem: mounted at boot from the boot partition on the SD card or virtio disk
Future improvements: ext2-style block groups, HTree directories for fast lookup, journalling for crash safety.
30.6 Exercises
Exercise 1: File Size Limits
Calculate the maximum file size given 12 direct blocks, a single indirect block, and a double indirect block. Assume 512-byte blocks and 4-byte block pointers.
Exercise 2: Disk Format Tool
Write a user-space tool (mkfs.sfs) that creates an SFS filesystem image on a file. It should write the superblock, bitmaps, inode table, and root directory.
30.7 Summary
A filesystem organizes persistent data on block devices. Our simple FS uses a classic Unix-style layout: superblock with parameters, bitmaps for free space tracking, an inode table for file metadata, and data blocks for content. Files are represented as inodes with direct and indirect block pointers. Directories are simple files containing name-to-inode mappings. File operations (open, read, write) translate file positions to physical block numbers through the inode's block map.