File System

In the last project, you have seen how to read and write disk blocks, and you will now learn about how to build a file system. A file system controls the disk and provides the interface of files and directories (aka. folders) to end users, which you must have been familiar with through your everyday interaction with operating systems. There are numerous file systems and many of them are based on an abstraction called inode. We thus start by explaining this important file system abstraction.

Inode

Consider an inode as a number of blocks and a file system as an array of inodes. Say a file system maintains 1024 inodes and each inode holds the information of a file or a directory. For example, inodes in a file system could be used as follows.

Inode #12 holds 1 disk block for a small text file.
Inode #103 holds 31 disk blocks for the executable file of the cat user command.
Inode #781 holds 1 disk block for the /bin directory recording the information of all files under /bin. This directory typically holds executable files of user commands.

The mkfs tool gives a simple and concrete example of how inodes could be organized.

Use of inodes in `mkfs`

When you type make install or make qemu in egos-2000, the command will compile and run tools/mfks.c which produces a disk image file. Below is part of its output.

shell

> make install
...
[INFO] Load ino=0, 34 bytes
[INFO] Load ino=1, 48 bytes
[INFO] Load ino=2, 26 bytes
[INFO] Load ino=3, 15 bytes
[INFO] Load ino=4, 15 bytes
[INFO] Load ino=5, 348 bytes
[INFO] Load ino=7, cat.elf: 15600 bytes
[INFO] Load ino=8, vga_demo.elf: 9364 bytes
[INFO] Load ino=9, loop.elf: 11188 bytes
[INFO] Load ino=10, cd.elf: 15644 bytes
[INFO] Load ino=11, crash1.elf: 16772 bytes
[INFO] Load ino=12, crash2.elf: 9432 bytes
[INFO] Load ino=13, ls.elf: 10408 bytes
[INFO] Load ino=14, udp_demo.elf: 14760 bytes
[INFO] Load ino=15, echo.elf: 10144 bytes
[INFO] Load ino=6, ./   6 ../   0 cat   7 vga_demo   8 loop   9 cd  10 crash1  11 crash2  12 ls  13 udp_demo  14 echo  15
...

Let's take a closer look at these 16 inodes numbered from #0 to #15.

Inodes for directories

Inode #0..#4 and #6 are used for directories. The table below summarizes how the code in tools/mfks.c uses these inodes.

Inode Number	Corresponding directory	Bytes stored in the inode
#0	`/`	`./ 0 ../ 0 home/ 1 bin/ 6`
#1	`/home`	`./ 1 ../ 0 yunhao/ 2 rvr/ 3 yacqub/ 4`
#2	`/home/yunhao`	`./ 2 ../ 1 README 5`
#3	`/home/rvr`	`./ 3 ../ 1`
#4	`/home/yacqub`	`./ 4 ../ 1`
#6	`/bin`	`./ 6 ../ 0 cat 7 vga_demo 8 loop 9 cd 10 crash1 11 crash2 12 ls 13 udp_demo 14 echo 15`

First, every directory contains ./ and ../ corresponding to itself and its parent directory. For example, the parent directory of both /bin and /home is /, illustrated by the ../ 0 part of inode #1 and #6. In general, egos-2000 uses the format {name} {inode no} ... to represent a directory. If a {name} ends with /, this name would represent a sub-directory (e.g., rvr/ in inode #1). Otherwise, this name represents a file (e.g., README in inode #2). We use this convention for directories just because it is simple. In most file systems, these inodes would record more information such as file size or file creation time.

All the inodes in this table hold fewer than 512 bytes, so they all hold only one disk block.

Inodes for files

Inode #5 and #7..#15 are used for files. For example, as shown in the table above, inode #5 is for /home/yunhao/README which is a short text file. The content of this text file is defined by the 348 bytes of contents[5] in tools/mkfs.c. Since 348 is smaller than the 512-byte block size, inode #5 also holds only one disk block for this file.

As shown in the mkfs printing above, inode #7..#15 are used for the binary executable files of user commands, while these inodes hold more than one block because these executable files are larger than 512 bytes. For example, user command echo corresponds to inode #15 which holds 10144 bytes (i.e., ceil(10144/512)=20 disk blocks).

Lastly, NINODES is defined as 128 in header file library/file/inode.h, and mkfs uses NINODES to initialize the file system. This means that the file system supports 128 inodes, but only 16 of them are used when egos-2000 first starts to boot.

TIP

Our goal is to give you a concrete and minimal example of how a file system could be organized as an array of inodes. Make sure that you see how different inodes can hold different numbers of disk blocks and how an inode can be used as either a file or a directory. In the rest of P6, we will focus on the inode abstraction and ignore how a file system maps files or directories into inodes.

The inode store interface

In addition to NINODES, the header file library/file/inode.h defines inode_intf.

typedef struct inode_store* inode_intf; /* inode store interface */

struct inode_store {
    int (*getsize)(inode_intf self, uint ino);
    int (*setsize)(inode_intf self, uint ino, uint newsize);
    int (*read)(inode_intf self, uint ino, uint offset, block_t* block);
    int (*write)(inode_intf self, uint ino, uint offset, block_t* block);
    void* state;
};

In short, inode_store defines the file system interface in egos-2000 and there are two file systems (i.e., file0.c and file1.c under library/file) implementing this interface. In particular, each file system implements 4 functions read, write, getsize and setsize, as well as maintaining a certain state in the memory. Given that a pointer holds 4 bytes, struct inode_store simply holds 20 bytes and the first 16 bytes hold the addresses of the 4 interface functions (i.e., the address of their first instruction).

If you are familiar with object-oriented programming, you may consider inode_store as an interface (aka. template), and a file system as a class with a local variable and four methods implementing this interface. This explains the self argument of the interface functions, so a file system could access its in-memory state with self->state. In C, self simply holds 4 bytes as the address of the 20-byte struct inode_store.

For read and write, there are 3 more arguments: ino is the inode number, offset is the block number, and block is the memory address of 512 bytes holding the block data. For example, as we have seen, the executable of echo maps to inode #15 which holds 20 blocks. We can thus read inode #15 with an offset in range 0..19.

// Variable block holds 512 bytes in the memory
block_t block;
// Given `inode_intf fs`, read the 5th block from block 0..19 of inode #15
fs->read(fs, 15, 4, &block);

A simple file system

We now explain a simple file system in library/file/file0.c which implements the inode store interface. We start with the mydisk_init function.

inode_intf mydisk_init(inode_intf below, uint below_ino) {
    inode_intf self = malloc(sizeof(struct inode_store));
    self->getsize   = mydisk_getsize;
    self->setsize   = mydisk_setsize;
    self->read      = mydisk_read;
    self->write     = mydisk_write;
    self->state     = below;
    return self;
}

The code allocates 20 bytes for the inode store interface, and sets the 4 interface functions as the mydisk_* functions right before mydisk_init. Then below is recorded as self->state since the file system will need below for reading and writing the disk. Specifically, in apps/system/sys_file.c, there is another inode store called disk which simply reads or writes the disk with an offset while ignoring the inode number. A pointer to this disk variable is passed to mydisk_init as the below argument. This disk (or self->state) is used in functions mydisk_read and mydisk_write as follows.

#define DUMMY_DISK_OFFSET(ino, offset) ino * 128 + offset

int mydisk_read(inode_intf self, uint ino, uint offset, block_t* block) {
    inode_intf below = self->state;
    return below->read(below, 0, DUMMY_DISK_OFFSET(ino, offset), block);
}

int mydisk_write(inode_intf self, uint ino, uint offset, block_t* block) {
    inode_intf below = self->state;
    return below->write(below, 0, DUMMY_DISK_OFFSET(ino, offset), block);
}

This simple file system assumes that every inode holds at most 128 blocks, and operations to (ino, offset) are directly translated to reading or writing disk block ino*128+offset. For now, the mydisk_getsize and mydisk_setsize functions simply print out an error and halt, meaning that they could be useful but not being used in egos-2000.

TIP

You will start from here and implement your own file system within the mydisk_* functions. Make sure to set FILESYS = 0 in Makefile, so mkfs and egos-2000 will run your file system code in file0.c. Specifically, after make install, you should see MKFS is using *mydisk*.

A more structured file system

Your job in this project is to implement some more sophisticated data structurs for the inode store interface functions. When you modify such data structures, you typically need to read a disk block into memory, apply your modifications in the memory, and write the block back from memory to the disk. The data structures here are not very complex, but students often mess up what is in the memory and what is on the disk, leading to subtle bugs.

Specifically, you will implement a few linked lists on top of the disk which is similar to the FAT file system. We now explain how you would maintain the inodes, a linked list for each inode, and a linked list for all the unused blocks on the disk.

Linked list

Suppose inode #28 has 3 blocks being #100, #202 and #432 on the disk. We can then use a simple linked list data structure to represent this inode as follows.

struct inode {
    int size, head;
} inodes[128];

inodes[28].size = 3;
inodes[28].head = 100;

struct entry {
    int next;
} FAT_entry[DISK_SIZE / BLOCK_SIZE];

FAT_entry[100]  = 202;
FAT_entry[202]  = 432;
FAT_entry[432]  = -1;

We call it FAT_entry because it is similar to how the FAT file system maintains a linked list. However, if you define inodes and FAT_entry in C like above, they will be in the memory. Instead, we should put these two arrays on the disk.

Create the disk layout

This picture shows how you could put the linked list data structures on the disk.

Failed to load picture

We split the disk into 4 regions. The first block is called the super block. The next I blocks are called inode blocks, holding the inodes array. The next F blocks are called FAT table blocks, holding the FAT_entry array. All the other blocks starting from I+F+1 are used as data blocks. Given the size of struct inode and struct entry, an inode block holds 64 struct inode and a FAT table block holds 128 struct entry.

Therefore, for FAT_entry[100] = 202, we can read block I+1 into memory as an array of 128 entries, modify entry #100, and write it back to the disk. Similarly, for FAT_entry[202], we should read block I+2 into memory, modify entry #74 (i.e., 74=202%128), and write it back to the disk. Modifying the inodes array with disk blocks 1..I is similar.

Given the explanations above, your first task is to implement the mydisk_create function. This function initializes a file system on the disk, a process also known as disk formatting. Specifically, you will do the following.

Calculate I and F based on the number of inodes and the disk size.
Set the size of all the inodes to 0; Set the head of all the inodes to -1.
Create one linked list containing blocks I+F+1..N, which is called the free list.
Define a data structure for the super block which records I, F, as well as the head of the free list; Write to the super block with this data structure.
You don't have to touch the data blocks.

After the above steps, we will have a file system with all the inodes empty and all the data blocks in the free list. You may also read the treedisk_create function in library/file/file1.c as a reference.

TIP

Many file systems maintain a tree data structure instead of linked lists. The most widely-used one is probably B-tree which has a query complexity much lower than linked lists. Indeed, our solution for this project runs slower than treedisk on QEMU, reflecting the difference in complexity.

Read and write an inode

Your next task is to implement mydisk_read and mydisk_write. Intuitively, mydisk_read involves the following steps.

Read an inode block and find the struct inode corresponding to the ino argument.
Make sure that offset is smaller than the inode size.
Traverse from the start of the linked list (i.e., inode head) by offset times, and find the disk block number for the data block.
Read the data block into the block argument of mydisk_read.

The steps for mydisk_write are similar with one key difference. If the offset argument is larger than (or equal to) the inode size, you need to remove some blocks from the free list and add them to the linked list of the inode. This is likely the trickiest part in this project and make sure that you have updated all 4 disk regions correctly.

After implementing these two functions, egos-2000 should be able to run normally.

Get and set inode size

Your last task is to implement mydisk_getsize and mydisk_setsize. getsize should be easy since it is a subset of the read function. For setsize, you can start by implementing the case of nblocks being 0. Then you can think about how to design helper functions that are shared by setsize and write when nblocks is not 0.

Since egos-2000 does not use these functions, we provide a separate testing framework.

Test your file system code

We provide a testing framework for your file system code in this repo. Specifically, you can replace the file0.c in this repo with your code and follow the instructions in the README. To write a test, you write a trace file similar to the trace.txt in this repo:

txt

W:0:0:123456789
W:1:0:1234567
W:1:1:333333
R:0:0:123456789
R:1:0:1234567
R:1:1:333333

W:1:0:1234567 means to issue a write operation to your file system with inode #1, offset 0, and block data filled by 1234567. R:1:0:1234567 means to issue a read operation to your file system with inode #1 and offset 0, and then check whether the block data returned by your file system still holds 1234567.

You should write a lot more test traces that are more complex than trace.txt which would test your file system code comprehensively. Try to examine all the corner cases.

Accomplishments

You have gained hands-on experience with the abstraction of inode, file, and directory. In terms of code reading, you have finished reading library/file and tools/mkfs.c. You may also spend a bit of time reading apps/system/sys_file.c which serves all file system requests as a system process in egos-2000.

File System ​

Inode ​

Use of inodes in mkfs ​

Inodes for directories ​

Inodes for files ​

The inode store interface ​

A simple file system ​

A more structured file system ​

Linked list ​

Create the disk layout ​

Read and write an inode ​

Get and set inode size ​

Test your file system code ​

Accomplishments ​