System Call & Protection
You have seen how an operating system manages the life cycles of threads (or processes, since we don't distinguish between them yet) and preemptively schedules multiple threads using timer interrupts. This project now helps you understand two more things.
- How do threads invoke system calls to communicate with each other?
- How does an operating system protect its memory, so malicious threads cannot corrupt the system calls by modifying the code or data of the operating system?
System calls and memory protection are supported by exception handling, which is similar to interrupt handling. To begin, we introduce exception handling.
Exception handling
An exception happens if something goes wrong when the CPU executes an instruction. For example, an exception occurs when an instruction attempts to access memory at an invalid address. Instead of ignoring the problem and proceeding to the next instruction, the CPU automatically jumps to a special function called an exception handler, just as it jumps to the interrupt handler when receiving a timer interrupt. We will use the same function to handle both interrupts and exceptions, and use the mcause CSR to show why the CPU jumped to the handler.
The mcause CSR
Below are screenshots of Tables 22 and 23 from this manual, which describe the mcause CSR. You can also read Chapter 3.1.15 of the RISC-V Reference Manual. When an exception or interrupt occurs, the CPU sets mcause before jumping to the handler.

For example, mcause is set to 0x80000007 when the CPU receives a timer interrupt. Bit#31 of mcause is set to 1 because the timer interrupt is an interrupt. The exception code is set to 0b0000000111 because 7 is the machine timer interrupt code. Similarly, when the CPU encounters an illegal instruction, meaning the 4 bytes pointed to by the program counter cannot be decoded into a CPU instruction, mcause is set to 0x2 before the CPU jumps to the handler.
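As a small illustration of the two examples above, the following C sketch splits a 32-bit mcause value into its interrupt bit and its exception code. The names mcause_fields and mcause_decode are made up for this example and do not appear in egos-2000. The 10-bit width of the code field is taken from the 0b0000000111 value quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of a 32-bit mcause value (hypothetical helper type). */
struct mcause_fields {
    bool     is_interrupt; /* bit 31: 1 = interrupt, 0 = exception */
    uint32_t code;         /* bits 0-9: interrupt or exception code */
};

/* Split mcause into the interrupt bit and the 10-bit code. */
struct mcause_fields mcause_decode(uint32_t mcause) {
    struct mcause_fields f;
    f.is_interrupt = ((mcause >> 31) & 1u) != 0;
    f.code         = mcause & 0x3FFu;
    return f;
}
```

With this helper, 0x80000007 decodes to an interrupt with code 7, and 0x2 decodes to an exception with code 2.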
TIP
Exceptions are different from interrupts. Exceptions are triggered by the CPU itself when an instruction it is executing goes wrong. Interrupts are triggered by devices outside of the CPU, such as a timer, a disk, or a network interface controller. The mcause CSR tells the operating system which interrupt or exception needs to be handled.
The ecall exception
Most exceptions occur due to errors, but RISC-V provides a special instruction, ecall, which intentionally triggers the so-called environment call exception (see exception #8 and #11 in Table 23). This instruction is for system calls: when a thread invokes ecall, the control flow transfers to the operating system, which serves a system call for that thread.
You can find ecall in library/syscall/syscall.c of egos-2000. After asm("ecall") raises an environment call exception, the CPU jumps to trap_entry, as you saw in P2. trap_entry calls kernel_entry, which calls excp_entry, and excp_entry handles system calls in this condition: (id >= EXCP_ID_ECALL_U && id <= EXCP_ID_ECALL_M). EXCP_ID_ECALL_U and EXCP_ID_ECALL_M are defined as 8 and 11, respectively, according to Table 23. Read trap_entry and kernel_entry yourself and see how mcause is used.
TIP
The U-Mode and M-mode in Table 23 stand for user mode and machine mode. We will cover these privilege modes very soon when we start explaining memory protection later in this project.
A sketch of the "kernel"
We have been using the term kernel since P2, but we have never explained what it is. Now that you know the mcause CSR, we can show you a sketch of the kernel.
void kernel() {
    int mcause_val, id;
    asm("csrr %0, mcause" : "=r"(mcause_val));
    id = mcause_val & 0x3FF;
    if (mcause_val & (1u << 31)) {
        /* Interrupt: 7 is the machine timer interrupt. */
        if (id == 7) proc_yield();
    } else {
        /* Exception: 8-11 are environment calls; 1, 5, 7 are access faults. */
        if (id >= 8 && id <= 11) handle_system_call();
        if (id == 1 || id == 5 || id == 7) handle_memory_access_fault();
    }
}
The code above sketches the core of an OS kernel:
- handle thread scheduling upon a timer interrupt
- handle system calls when a thread invokes ecall
- handle other exceptions such as memory access faults
This is just a sketch, since a complete OS must handle all interrupts and exceptions; these three items are probably the most important. You have seen thread scheduling in P2, and P3 will give you hands-on experience with system calls and memory access faults.
Inter-process communication
There are only 2 types of system calls in egos-2000. They are designed for inter-process communication, meaning sending and receiving messages. Next, we introduce the system call interface for applications, and then explain what happens within the OS kernel.
Application-side interface
The code below is from library/syscall/syscall.h, and it defines the data structures for system calls in egos-2000.
enum syscall_type {
    SYS_RECV = 1,
    SYS_SEND = 2,
};
struct syscall {
    enum syscall_type type; /* SYS_SEND or SYS_RECV */
    int sender;             /* sender process ID */
    int receiver;           /* receiver process ID */
    char content[1024];
    enum { PENDING, DONE } status;
};
The content field holds the message being sent or received. Say process A wants to send a message to process B through SYS_SEND; this system call may not succeed immediately. It succeeds only after process B invokes SYS_RECV with process A as the sender, meaning process B is ready to receive a message from process A. For this reason, before process B invokes SYS_RECV, the SYS_SEND system call made by process A has a status of PENDING rather than DONE.
TIP
In other words, egos-2000 implements a blocking version of inter-process communication, so a system call returns only after a message has been successfully sent or received. It is also possible to implement a non-blocking version in which system calls return immediately. egos-2000 uses the blocking version for code simplicity.
With this struct syscall in mind, the system call interface sys_send and sys_recv in library/syscall/syscall.c should be easy to understand.
static struct syscall* sc = (struct syscall*)SYSCALL_ARG;
void sys_send(int receiver, char* msg, uint size) {
    sc->type = SYS_SEND;
    sc->receiver = receiver;
    memcpy(sc->content, msg, size);
    asm("ecall");
}
void sys_recv(int from, int* sender, char* buf, uint size) {
    sc->type = SYS_RECV;
    sc->sender = from;
    asm("ecall");
    memcpy(buf, sc->content, size);
    if (sender) *sender = sc->sender;
}
Again, the ecall in both functions triggers an environment call exception, and the CPU jumps to the exception handler, trap_entry, right after executing the ecall instruction.
Kernel-side handling
The exception handler in egos-2000, trap_entry, calls kernel_entry(), which then calls excp_entry() after an environment call exception is raised by ecall. We now explain the following if-statement for system calls.
if (id >= EXCP_ID_ECALL_U && id <= EXCP_ID_ECALL_M) {
    /* Copy the system call arguments from user space to the kernel. */
    uint syscall_paddr = earth->mmu_translate(curr_pid, SYSCALL_ARG);
    memcpy(&proc_set[curr_proc_idx].syscall, (void*)syscall_paddr,
           sizeof(struct syscall));
    proc_set[curr_proc_idx].syscall.status = PENDING;
    proc_set_pending(curr_pid);
    proc_set[curr_proc_idx].mepc += 4;
    proc_try_syscall(&proc_set[curr_proc_idx]);
    proc_yield();
    return;
}
First of all, proc_set[curr_proc_idx] is the struct representing the current process that has just invoked the system call.
Note that this process initialized the syscall struct at memory address SYSCALL_ARG. The mmu_translate and memcpy statements copy this data structure into the kernel. You can ignore earth->mmu_translate for now; we will explain it in P4. The next statement sets the system call status to PENDING, and proc_set_pending sets the process status to PROC_PENDING_SYSCALL.
Recall that mepc stands for the program counter when the exception occurs, and the value of mepc is read into proc_set[curr_proc_idx].mepc in function kernel_entry. Therefore, the mepc += 4 statement says that after the system call completes, the kernel should return to the instruction immediately after ecall (i.e., skip the 4-byte ecall instruction).
The proc_try_syscall call attempts to process the SYS_SEND or SYS_RECV system call for the current process, and proc_yield finds the next process to schedule. proc_try_syscall is also called in proc_yield because the scheduler repeatedly attempts to process a pending system call until it succeeds.
Read proc_try_syscall(), proc_try_send() and proc_try_recv(). There are only ~40 lines of code, but they gracefully handle inter-process communication. At a high level, if a process has a pending system call, proc_try_syscall() will retry the system call and, if it succeeds, set the process status to PROC_RUNNABLE.
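To make the retry idea concrete, here is a toy model of a blocking send that runs on an ordinary computer. It is only a sketch and not the code in grass/kernel.c: the names toy_proc, toy_syscall, and toy_try_send, the 4-entry process table, and the 32-byte message size are all assumptions made for illustration.

```c
#include <string.h>

#define NPROC   4   /* hypothetical size of the toy process table */
#define MSG_LEN 32  /* hypothetical message size for this toy model */

enum toy_type   { TOY_NONE, TOY_RECV, TOY_SEND };
enum toy_status { TOY_PENDING, TOY_DONE };

struct toy_proc {
    enum toy_type   type;
    enum toy_status status;
    int  peer;              /* receiver for a send, expected sender for a recv */
    char content[MSG_LEN];
};

struct toy_proc procs[NPROC];

/* Record a system call for process pid and mark it as pending. */
void toy_syscall(int pid, enum toy_type type, int peer, const char *msg) {
    struct toy_proc *p = &procs[pid];
    p->type   = type;
    p->status = TOY_PENDING;
    p->peer   = peer;
    memset(p->content, 0, MSG_LEN);
    if (msg) strncpy(p->content, msg, MSG_LEN - 1);
}

/* Retry the pending send of process pid.
 * Returns 1 if the message was delivered, 0 if it must stay pending. */
int toy_try_send(int pid) {
    struct toy_proc *s = &procs[pid];
    if (s->type != TOY_SEND || s->status != TOY_PENDING) return 0;
    if (s->peer < 0 || s->peer >= NPROC) return 0;

    struct toy_proc *r = &procs[s->peer];
    /* The receiver must be waiting in a recv that names this sender. */
    if (r->type != TOY_RECV || r->status != TOY_PENDING || r->peer != pid)
        return 0;

    memcpy(r->content, s->content, MSG_LEN);
    s->status = TOY_DONE;
    r->status = TOY_DONE;
    return 1;
}
```

In this model, a send from process 1 to process 2 stays pending when toy_try_send(1) is called before process 2 has issued its recv. Once process 2 records a recv naming process 1, the next retry delivers the message and marks both calls as done, which is the point at which a real kernel would make both processes runnable again.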
TIP
At this point, you have finished reading grass/kernel.c, including kernel_entry, intr_entry, excp_entry, proc_yield, proc_try_syscall, proc_try_send, and proc_try_recv.
Introduce process sleep
After getting familiar with the system call control flow, we now ask you to use system calls to introduce process sleep. As shown in library/syscall/servers.h, the GPID_PROCESS process in egos-2000 accepts 3 message types to spawn and terminate processes. Your job is to add a fourth message type, PROC_SLEEP, to the struct proc_request so that the process that sends this message to GPID_PROCESS will sleep for a specified amount of time before being scheduled again. Start from a fresh copy of egos-2000, and add the following code as a new file apps/user/sleep.c.
#include "app.h"
int main() {
    const uint usec_cnt = 5000000;
    printf("Start to sleep for %d microseconds.\n\r", usec_cnt);
    sleep(usec_cnt);
    printf("Woke up again after %d microseconds.\n\r", usec_cnt);
}
Then run make qemu, and sleep will be automatically added as a user command:
> make qemu
...
➜ /home/yunhao sleep
Start to sleep for 5000000 microseconds.
Woke up again after 5000000 microseconds.
For now, you will see the second line printed immediately after the first because the sleep function in library/syscall/servers.c has not been implemented. You shall see the second line 5 seconds after the first line when you complete the following steps.
1. Update the struct proc_request and the sleep function mentioned above so that this sleep function sends a PROC_SLEEP message to the GPID_PROCESS process.
2. In apps/system/sys_proc.c, add a case for PROC_SLEEP and put debug printing there temporarily, so you know that GPID_PROCESS succeeds in receiving the message.
3. Add the proc_sleep function in grass/process.c to the grass layer interface (struct grass in library/egos.h), and initialize it in grass/init.c, just like proc_alloc.
4. Invoke grass->proc_sleep in the PROC_SLEEP case you have just added in step 2. Then add debug printing in the proc_sleep function of grass/process.c, so you know that proc_sleep is called by GPID_PROCESS with the correct pid and usec arguments.
5. Implement this proc_sleep function, which should put the process identified by pid to sleep for usec microseconds. This involves a few modifications to the kernel.
   - Add one or more fields to struct process, and initialize them in proc_alloc().
   - Modify these fields for process pid in proc_sleep(). In addition to argument usec, you need mtime_get(), which returns the clock time in 10^-7 seconds (on QEMU).
   - Modify proc_yield() to schedule a process only if it is not sleeping, using the fields in struct process and the latest clock time from mtime_get().
6. The kernel may now encounter a situation in which no process can be scheduled. You need to handle this situation by replacing the FATAL in proc_yield() with your code.
7. Remove the debug printings. Run sleep again in the egos-2000 shell, and you shall see the Woke up ... printing 5 seconds after the first line of printing.
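The time arithmetic behind these steps can be sketched as follows. This is a hedged illustration rather than a reference solution: it assumes the clock value is a 64-bit tick count with one tick per 10^-7 seconds, as stated above for QEMU, and the names TICKS_PER_USEC, wakeup_time, and is_sleeping are hypothetical. How you store the wake-up time in struct process is up to you.

```c
#include <stdint.h>

/* Assumption: the clock counts in units of 10^-7 s, so 1 microsecond = 10 ticks. */
#define TICKS_PER_USEC 10u

/* Clock value at which a process that starts sleeping at `now` should wake up. */
uint64_t wakeup_time(uint64_t now, uint32_t usec) {
    return now + (uint64_t)usec * TICKS_PER_USEC;
}

/* Returns 1 while the wake-up time is still in the future, 0 otherwise. */
int is_sleeping(uint64_t now, uint64_t wakeup) {
    return now < wakeup;
}
```

Under this assumption, sleeping for 5000000 microseconds corresponds to 50000000 clock ticks. The multiplication is done in 64 bits so that long sleeps do not overflow a 32-bit value.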
Protect the OS memory
So far, all the code we have seen runs in the so-called machine mode, which means it can freely access memory. However, user applications should not be able to read or write memory freely. Otherwise, a malicious application could corrupt the kernel's memory and cause damage. At a high level, we now ask you to do 3 things:
- Specify the memory regions that code in the user mode is allowed to access.
- Run the code of all user applications in user mode rather than machine mode.
- Terminate a user application if it triggers an exception by reading or writing the memory at an address outside of the allowed regions.
Set up a PMP region
Read through chapter 3.7 of the RISC-V reference manual for Physical Memory Protection (PMP), and then write your code in earth/cpu_mmu.c:
void mmu_init() {
    /* Setup a PMP region for the whole 4GB address space. */
    asm("csrw pmpaddr0, %0" : : "r"(0x40000000));
    asm("csrw pmpcfg0, %0" : : "r"(0xF));
    /* Student's code goes here (System Call & Protection). */
    /* Replace the PMP region above with a NAPOT region 0x80200000 - 0x80400000
     * and set the permission for user mode access as r/w/x. */
    /* Student's code ends here. */
    ...
}
TIP
Your code should overwrite the two CSRs pmpaddr0 and pmpcfg0, so the 4GB region no longer takes effect and is replaced by the 2MB region [0x80200000, 0x80400000). As a result, you will only be able to choose software TLB when booting egos-2000 in the rest of P3.
Specifically, code running in user mode cannot access any memory region by default. After you finish the PMP code above, code running in user mode will be able to access only one memory region: [0x80200000, 0x80400000). This region contains the code, data, heap, and stack of the current process (i.e., everything a user application needs).
However, PMP won't take any effect if we still run everything in machine mode, so we need to switch privilege modes when switching the CPU context from the kernel back to a user application process.
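Returning to the NAPOT region for a moment, the encoding described in chapter 3.7 of the manual can be sketched in plain C. In NAPOT mode, pmpaddr holds the base address shifted right by 2, with its low bits replaced by a run of ones whose length encodes the region size. The helper names below are hypothetical, and the sketch assumes a 32-bit pmpaddr, a power-of-two size of at least 8 bytes, and a base aligned to that size.

```c
#include <stdint.h>

/* Bits of one PMP configuration byte. */
enum {
    PMP_R     = 0x01, /* read permission */
    PMP_W     = 0x02, /* write permission */
    PMP_X     = 0x04, /* execute permission */
    PMP_NAPOT = 0x18  /* A field (bits 3-4) = 0b11 */
};

/* Encode a NAPOT region as a pmpaddr value.
 * Assumes size is a power of two >= 8 and base is aligned to size. */
uint32_t napot_pmpaddr(uint32_t base, uint32_t size) {
    return (base >> 2) | ((size >> 3) - 1u);
}

/* Count the run of 1 bits starting at bit 0. */
static unsigned trailing_ones(uint32_t v) {
    unsigned n = 0;
    while (n < 32 && ((v >> n) & 1u)) n++;
    return n;
}

/* Region size in bytes: n trailing ones encode 2^(n+3) bytes. */
uint64_t napot_size(uint32_t pmpaddr) {
    return 1ull << (trailing_ones(pmpaddr) + 3);
}

/* Region base address: clear the trailing ones, then shift left by 2. */
uint64_t napot_base(uint32_t pmpaddr) {
    uint64_t low = (1ull << trailing_ones(pmpaddr)) - 1;
    return ((uint64_t)pmpaddr & ~low) << 2;
}
```

Passing a region's base and size to napot_pmpaddr gives a candidate value for pmpaddr0, and OR-ing PMP_NAPOT with the three permission bits gives the lowest byte of pmpcfg0. The two decode helpers are a way to check that a value encodes the region you intended before writing it to the CSR.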
Switch privilege modes
You need to understand mstatus.MPP and update mstatus.MPP in proc_yield according to the comments there. Recall that mstatus.MPP stands for bit#11 and bit#12 of mstatus:

You will need to set these bits to 11 if the next scheduled process is a kernel process (i.e., pid < GPID_USER_START), or set them to 00 for all other processes. In RISC-V, 0 stands for user mode, and 3 (i.e., 11 in binary) stands for machine mode. To see how it works, we need to explain what happens when entering and exiting the kernel.
Upon an interrupt or exception, the CPU enters the kernel and automatically switches the privilege mode to machine mode before jumping to the trap_entry handler. This allows the kernel to run in machine mode and freely access the memory.
Upon executing mret in grass/kernel.s, the CPU exits the kernel, and mret will switch the privilege mode according to mstatus.MPP. Therefore, if we set mstatus.MPP to 00 in proc_yield, the next scheduled process will run in user mode after the mret instruction.
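The bit manipulation described above can be sketched as a pure function on an mstatus value. This is an illustration under assumptions rather than the code expected in proc_yield: the names priv_mode, mode_for_pid, and mstatus_with_mpp are hypothetical, and the first user pid is passed in as a parameter instead of using GPID_USER_START directly.

```c
#include <stdint.h>

#define MSTATUS_MPP_SHIFT 11
#define MSTATUS_MPP_MASK  (3u << MSTATUS_MPP_SHIFT) /* bit#11 and bit#12 */

enum priv_mode { PRIV_USER = 0, PRIV_MACHINE = 3 };

/* Kernel processes (pid below the first user pid) run in machine mode. */
enum priv_mode mode_for_pid(int pid, int first_user_pid) {
    return pid < first_user_pid ? PRIV_MACHINE : PRIV_USER;
}

/* Return mstatus with the MPP field replaced by mode; other bits are kept. */
uint32_t mstatus_with_mpp(uint32_t mstatus, enum priv_mode mode) {
    return (mstatus & ~MSTATUS_MPP_MASK) |
           ((uint32_t)mode << MSTATUS_MPP_SHIFT);
}
```

In the kernel, the old value would be read with csrr and the new value written back with csrw before mret. Clearing the field first matters because MPP may still hold 11 from the trap that entered the kernel.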
Kill malicious applications
To test whether you correctly set the PMP region and switched privilege modes, we have provided 2 malicious applications, crash1 and crash2, in the apps/user directory. The malicious applications would halt the whole operating system by corrupting the memory.
> make qemu
...
[CRITICAL] Choose a memory translation mechanism:
Enter 0: page tables
Enter 1: software TLB
[INFO] Software translation is chosen
...
[CRITICAL] Welcome to the egos-2000 shell!
➜ /home/yunhao crash1
_sbrk: heap grows too large
[FATAL] excp_entry: kernel got exception 7
Note that this FATAL happens at the end of function excp_entry. Your final task in P3 is to implement the following part of excp_entry.
static void excp_entry(uint id) {
    ...
    /* Student's code goes here (System Call & Protection | Virtual Memory). */
    /* Kill the current process if curr_pid is a user application. */
    /* Student's code ends here. */
    FATAL("excp_entry: kernel got exception %d", id);
}
After excp_entry gracefully kills the malicious applications, you should see the following.
# Make sure to choose software TLB
> make qemu
...
> /home/yunhao crash1
_sbrk: heap grows too large
[INFO] process 6 terminated with exception 7
> /home/yunhao crash2
[INFO] process 7 terminated with exception 7
> /home/yunhao
In other words, memory protection should work: malicious applications running in user mode trigger memory exceptions when attempting to corrupt memory. The kernel kills these malicious applications when handling such exceptions.
Accomplishments
In terms of OS concepts, you have learned about exception handling, system calls, privilege modes, and inter-process communication. In terms of code reading, you have read all the code in grass and library/syscall. The grass layer is the kernel in egos-2000.
You will read earth/cpu_mmu.c and library/elf/* in P4, read earth/dev_disk.c and earth/dev_tty.c in P5, and read library/file/* in P6. Then you will essentially have read all the code for egos-2000. We are halfway there!