System Call & Protection

So far, you have learned how an operating system manages the life cycle of threads (or processes, as we don't distinguish the two right now) and schedules multiple threads in a preemptive fashion. This project helps you understand two more things:

  1. how threads invoke system calls in order to communicate with each other
  2. how an operating system protects its memory so that threads cannot corrupt the system calls by modifying the code or data of the operating system

System calls and memory protection rely on exception handling, which is very similar to the interrupt handling you have seen in P2. We thus start by introducing exception handling.

Exception handling

An exception happens if something goes wrong when the CPU executes an instruction. For example, an exception would happen when an instruction tries to access the memory at an invalid address. Instead of ignoring the problem and proceeding to the next instruction, the CPU automatically jumps to a special function called an exception handler, just like how the CPU jumps to the interrupt handler when receiving a timer interrupt. Indeed, we will use the same function to handle both interrupts and exceptions and use the mcause CSR explained below to identify what has caused the CPU to call the handler function.

The mcause CSR

Below is a screenshot of Table 22 and Table 23 from this CPU document which describe an important CSR, mcause, designed for exception and interrupt handling. Upon an exception or interrupt, the CPU sets the value of mcause before jumping to the handler function.

[Figure: Table 22 and Table 23 of the CPU document, describing the mcause CSR]

For example, mcause is set to 0x80000007 when the CPU receives a timer interrupt. Bit#31 of mcause is set to 1 because a timer interrupt is an interrupt, and the lower bits are set to 7 because 7 is the code identifying the machine timer interrupt. Similarly, if the CPU encounters an illegal instruction (i.e., the 4 bytes pointed to by the program counter cannot be decoded into a CPU instruction), mcause will be set to 0x2 before the CPU jumps to the handler function.
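The two examples above can be checked with a small decoding helper. This is only a sketch that runs on any machine: `struct mcause_info` and `mcause_decode` are hypothetical names that do not exist in egos-2000, and the 0x3FF mask is the same one used in the kernel sketch later in this document.

```c
#include <stdint.h>

/* Hypothetical helper type, not part of egos-2000. */
struct mcause_info {
    int is_interrupt;  /* 1 if bit#31 is set, 0 for an exception */
    uint32_t code;     /* interrupt or exception code            */
};

/* Split a 32-bit mcause value into its interrupt flag and its code. */
struct mcause_info mcause_decode(uint32_t mcause) {
    struct mcause_info info;
    info.is_interrupt = (int)((mcause >> 31) & 1u);
    info.code = mcause & 0x3FFu;  /* keep the low 10 bits */
    return info;
}
```

Decoding 0x80000007 gives an interrupt with code 7, and decoding 0x2 gives an exception with code 2, matching the text above.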

TIP

Exceptions are different from interrupts. Exceptions are triggered by CPU instructions that go wrong. Interrupts are triggered by devices outside of the CPU, such as a timer, a disk, or a network interface controller. The similarity is that both need to be handled by the operating system, and the mcause CSR tells the operating system what needs to be handled.

The ecall exception

Most exceptions happen due to something wrong such as invalid memory access. When the code of a thread triggers an exception, the operating system typically terminates this thread and prints out an error message accordingly.

However, RISC-V provides a special instruction, ecall, which triggers the environment call exception intentionally (i.e., exception #8 or #11 in Table 23). This is the CPU instruction for system calls: when a thread invokes ecall and triggers this exception, the operating system serves a system call for this thread instead of terminating it.

In egos-2000, this happens in library/syscall/syscall.c. Specifically, asm("ecall") in the code will raise an environment call exception. The CPU then sets the program counter to trap_entry as we have discussed in P2. trap_entry further calls kernel_entry, and kernel_entry calls excp_entry, where system calls are handled within the if statement with condition (id >= EXCP_ID_ECALL_U && id <= EXCP_ID_ECALL_M). EXCP_ID_ECALL_U and EXCP_ID_ECALL_M are defined as 8 and 11 respectively according to Table 23. Take a look at how mcause has been read and used in trap_entry and kernel_entry.

TIP

The U-mode and M-mode in Table 23 stand for user mode and machine mode. We will touch on these privilege modes very soon when we start to explain memory protection.

A sketch of the "kernel"

We have been using the word "kernel" since P2, but we never explained what an OS kernel is. With the knowledge of mcause, we are ready to show you a sketch of the "kernel".

c
void kernel() {
    unsigned int mcause_val, id;
    asm("csrr %0, mcause" : "=r"(mcause_val));
    id = mcause_val & 0x3FF;          /* interrupt or exception code */
    if (mcause_val & (1u << 31)) {    /* bit#31 is 1: an interrupt   */
        if (id == 7) proc_yield();    /* machine timer interrupt     */
    } else {                          /* bit#31 is 0: an exception   */
        if (id >= 8 && id <= 11) handle_system_call();
        if (id == 1 || id == 5 || id == 7) handle_memory_access_fault();
    }
}

The code above sketches the core of an operating system:

  • handle thread scheduling upon a timer interrupt
  • handle system calls when a thread invokes ecall
  • handle other exceptions such as memory access faults

We call it a sketch because a complete operating system needs to handle all the interrupts and exceptions, while the 3 bullets above are only the most important ones. You have played with thread scheduling in P2, and P3 will give you hands-on experience with the last two aspects of an OS kernel.

Inter-process communication

In egos-2000, there are only 2 types of system calls, both designed for inter-process communication (i.e., sending and receiving messages). We now introduce the system call interface for applications and then explain what happens within the OS kernel.

Application-side interface

The code below is from library/syscall/syscall.h and it defines the data structures for system calls in egos-2000.

c
enum syscall_type {
    SYS_UNUSED,
    SYS_RECV,  /* 1 */
    SYS_SEND,  /* 2 */
};

struct syscall {
    enum syscall_type type; /* SYS_SEND or SYS_RECV */
    int sender;             /* sender process ID    */
    int receiver;           /* receiver process ID  */
    char content[SYSCALL_MSG_LEN];
    enum {PENDING, DONE} status;
};

The content field holds the message being sent or received. Say process A wants to send a message to process B through SYS_SEND. This system call may not succeed immediately: it will only succeed after process B invokes the SYS_RECV system call with sender being process A, meaning that process B is ready to receive a message from process A. For this reason, until process B invokes the SYS_RECV system call, the SYS_SEND system call by process A stays in status PENDING instead of DONE.

TIP

In other words, egos-2000 implements a blocking version of inter-process communication such that a system call would only return after a message has been successfully sent or received. It is certainly possible to implement a non-blocking version such that system calls return immediately. egos-2000 implements the blocking version just for simplicity of the code.

With struct syscall in mind, the system call interface sys_send and sys_recv defined in library/syscall/syscall.c should be easy to understand.

c
static struct syscall* sc = (struct syscall*)SYSCALL_ARG;

void sys_send(int receiver, char* msg, uint size) {
    sc->type = SYS_SEND;
    sc->receiver = receiver;
    memcpy(sc->content, msg, size);
    asm("ecall");
}

void sys_recv(int from, int* sender, char* buf, uint size) {
    sc->sender = from;
    sc->type = SYS_RECV;
    asm("ecall");
    memcpy(buf, sc->content, size);
    if (sender) *sender = sc->sender;
}

Again, the ecall instructions in the code above trigger an environment call exception, so the CPU jumps to the exception handler as soon as it executes ecall.

Kernel-side handling

As we have seen in P2, the exception handler function in egos-2000, trap_entry, will call kernel_entry which further calls excp_entry for the environment call exception. We now explain the first few lines of excp_entry.

c
if (id >= EXCP_ID_ECALL_U && id <= EXCP_ID_ECALL_M) {
    proc_set[curr_proc_idx].mepc += 4;
    memcpy(&proc_set[curr_proc_idx].syscall, (void*)SYSCALL_ARG, sizeof(struct syscall));
    proc_set[curr_proc_idx].syscall.status = PENDING;
    proc_try_syscall(&proc_set[curr_proc_idx]);
    proc_yield();
    return;
}

First of all, proc_set[curr_proc_idx] is the struct process in the PCB representing the current process which has just invoked the system call.

  • Recall that mepc stands for the program counter when the exception occurs, and the value of mepc is read into proc_set[curr_proc_idx].mepc in function kernel_entry. Therefore, line#2 says that, after the system call is done, the kernel should return to the instruction right after ecall in this process (i.e., skip the 4-byte instruction ecall).

  • Note that the process initialized the struct syscall data structure at memory address SYSCALL_ARG. Line#3 copies this data structure into the PCB and line#4 sets the system call status as PENDING. Line#5 tries to handle the system call and, as we just mentioned, the system call may not succeed immediately. Line#6 finds the next process to schedule just like what you have learned in P2.

  • You can also find proc_try_syscall in proc_yield because the scheduler will attempt a pending system call repeatedly until it succeeds.

Please read proc_try_syscall, proc_try_send and proc_try_recv yourself. There are only 40 lines of code, but they gracefully handle the message passing between processes. At a high level, if a process has a pending system call, proc_try_syscall will attempt the system call and set the process status to PROC_RUNNABLE or PROC_PENDING_SYSCALL according to whether the attempt succeeds or not.
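To make the blocking rule concrete before you read the real code, here is a toy model of the matching step. It is a sketch under our own assumptions and not the egos-2000 implementation: `struct toy_proc`, `toy_try_send` and `MSG_LEN` are hypothetical names, and the model ignores scheduling and process status entirely.

```c
#include <string.h>

#define MSG_LEN 32  /* hypothetical; stands in for SYSCALL_MSG_LEN */

enum sys_type   { SYS_UNUSED, SYS_RECV, SYS_SEND };
enum sys_status { PENDING, DONE };

/* Hypothetical, simplified view of one process and its system call. */
struct toy_proc {
    int pid;
    enum sys_type type;
    enum sys_status status;
    int sender, receiver;
    char content[MSG_LEN];
};

/* Try to complete a pending SYS_SEND from src to dst.
 * Returns 1 if the message was delivered, 0 if the send stays PENDING. */
int toy_try_send(struct toy_proc *src, struct toy_proc *dst) {
    if (src->type != SYS_SEND || src->status != PENDING) return 0;
    if (src->receiver != dst->pid) return 0;

    /* The receiver must already be waiting in SYS_RECV for this sender. */
    if (dst->type != SYS_RECV || dst->status != PENDING) return 0;
    if (dst->sender != src->pid) return 0;

    memcpy(dst->content, src->content, MSG_LEN);
    src->status = DONE;
    dst->status = DONE;
    return 1;
}
```

Calling toy_try_send before the receiver has issued SYS_RECV returns 0 and leaves the send PENDING, which is why the scheduler has to retry the attempt later.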

TIP

At this point, you have finished reading grass/kernel.c, including kernel_entry, intr_entry, excp_entry, proc_yield, proc_try_syscall, proc_try_send and proc_try_recv.

Syscall for process sleep

After getting familiar with the system call control flow, we now ask you to use system calls to enable process sleep. As shown in library/syscall/servers.h, the GPID_PROCESS process in egos-2000 accepts 3 message types for spawning and terminating processes. Your job is to add a fourth message type PROC_SLEEP to struct proc_request so that a process that sends this message to GPID_PROCESS will sleep for a given number of clock ticks before being scheduled again. Specifically, you will

  1. Start with a fresh copy of egos-2000.

  2. Modify struct process to record how many clock ticks a process needs to wait before it finishes sleeping. Initialize this counter as 0 in proc_alloc.

  3. Modify proc_yield or other parts of the scheduler to update the counters upon a timer interrupt and only schedule threads that no longer need to sleep.

  4. Add a function proc_sleep(pid, nticks) in both grass/kernel.c and struct grass, so applications can invoke grass->proc_sleep and set the counter for pid in the PCB. Note that struct grass is initialized in grass/init.c.

  5. Modify apps/system/sys_process.c to handle the new PROC_SLEEP message by calling grass->proc_sleep with pid being sender.

  6. Add a helper function sleep(nticks) in library/syscall/servers.c which prepares a PROC_SLEEP message containing nticks and sends it to process GPID_PROCESS. Other functions in library/syscall/servers.c do similar things for different system servers.

  7. Lastly, add an application apps/user/sleep.c to test the sleep helper function.

c
#include "app.h"

const int nticks = 1000;  /* You may adjust this number. */
int main() {
    printf("Start to sleep for %d ticks\n\r", nticks);
    sleep(nticks);
    printf("Wake up after sleeping for %d ticks\n\r", nticks);
}

  8. You should not encounter a situation where no process can be scheduled. The reason is that GPID_TERMINAL should always be able to run and it never calls sleep.
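Before touching the kernel, it may help to see the bookkeeping in isolation. The sketch below is a toy model that runs on any machine, not a drop-in solution: `struct toy_pcb`, `toy_set`, `toy_sleep`, `toy_tick` and `toy_next` are hypothetical names, and the real struct process and proc_yield contain much more state.

```c
#define TOY_NPROCS 4  /* hypothetical table size */

/* Hypothetical, stripped-down process control block. */
struct toy_pcb {
    int in_use;
    unsigned int sleep_ticks;  /* ticks left before the process may run */
};

struct toy_pcb toy_set[TOY_NPROCS];

/* Record that process idx should sleep for nticks timer ticks. */
void toy_sleep(int idx, unsigned int nticks) {
    toy_set[idx].sleep_ticks = nticks;
}

/* Called once per timer interrupt: age every sleeping process by one tick. */
void toy_tick(void) {
    for (int i = 0; i < TOY_NPROCS; i++)
        if (toy_set[i].in_use && toy_set[i].sleep_ticks > 0)
            toy_set[i].sleep_ticks--;
}

/* Round-robin search starting after curr: return the next process that is
 * in use and not sleeping, or -1 if there is none. */
int toy_next(int curr) {
    for (int k = 1; k <= TOY_NPROCS; k++) {
        int i = (curr + k) % TOY_NPROCS;
        if (toy_set[i].in_use && toy_set[i].sleep_ticks == 0) return i;
    }
    return -1;
}
```

In this model, a process that sleeps for 2 ticks is skipped by toy_next until toy_tick has been called twice. The -1 case corresponds to the situation that should never arise in egos-2000 because GPID_TERMINAL never sleeps.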

TIP

At this point, you should have a full picture of system calls from applications to helper functions in library/syscall/servers.c, to the OS kernel, and lastly to the system servers in apps/system.

Protect the OS memory

So far, all the code we have seen runs in the so-called machine mode, which means that the code can freely access the memory. However, user applications should not be able to freely read/write the memory. Otherwise, a malicious application could corrupt the memory of the kernel and cause damage. At a high level, we now ask you to do 3 things.

  • specify the memory region that code in the user mode is allowed to read/write
  • run the code of all user applications in the user mode instead of the machine mode
  • terminate a user application if it triggers an exception by trying to read/write an address outside of the memory region allowed for the user mode

You will only touch two privilege modes in P3 and you will learn about a third one called the supervisor mode in P4. In terms of privilege, machine > supervisor > user.

Set up a PMP region

Read through chapter 3.6 of the RISC-V reference manual for Physical Memory Protection (PMP) and then write your code in earth/cpu_mmu.c:

c
void mmu_init() {
    ...
    /* Student's code goes here (PMP memory protection). */

    /* Setup PMP NAPOT region 0x80400000 - 0x80800000 as r/w/x */

    /* Student's code ends here. */
    ...
}

Specifically, code running in the user mode cannot access any memory region by default. After you finish the code in mmu_init, code running in the user mode will be able to access exactly one memory region, [0x80400000, 0x80800000), which holds the code, data, heap and stack of the currently running process.
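The NAPOT address encoding is the part of the PMP chapter that is easiest to get wrong, so here is a sketch of the arithmetic alone. The bit values follow the RISC-V privileged specification, while the macro and function names (`PMP_R`, `PMP_NAPOT`, `napot_encode`, `napot_rwx_cfg`) are our own and do not appear in egos-2000. The code only computes values and does not write any CSR.

```c
#include <stdint.h>

/* pmpcfg bit fields from the RISC-V privileged specification. */
#define PMP_R     0x01u
#define PMP_W     0x02u
#define PMP_X     0x04u
#define PMP_NAPOT 0x18u  /* A field (bits 4:3) = 3 selects NAPOT */

/* Encode a naturally aligned power-of-two region for a pmpaddr CSR.
 * size must be a power of two >= 8, and base must be a multiple of size.
 * The low bits become a run of ones whose length encodes the region size. */
uint32_t napot_encode(uint32_t base, uint32_t size) {
    return (base >> 2) | ((size >> 3) - 1);
}

/* Hypothetical helper: the pmpcfg byte for a r/w/x NAPOT entry. */
uint8_t napot_rwx_cfg(void) {
    return (uint8_t)(PMP_R | PMP_W | PMP_X | PMP_NAPOT);
}
```

For the region [0x80400000, 0x80800000), the size is 0x400000 bytes, so napot_encode yields 0x2017FFFF and the configuration byte is 0x1F. On real hardware, these values would still need to be written to a pmpaddr CSR and the matching byte of a pmpcfg CSR with csrw.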

However, PMP will not take effect if we still run everything in the machine mode, so we need to switch privilege modes when switching the CPU context from the kernel back to a user application.

Switch privilege modes

In short, you need to update mstatus.MPP in proc_yield:

c
void proc_yield() {
    ...
    /* Student's code goes here (PMP, page table translation, and multi-core). */

    /* Modify mstatus.MPP to enter machine or user mode during mret
     * depending on whether curr_pid is a grass server or a user app
     */

    /* Student's code ends here. */
    ...
}

Recall that mstatus.MPP stands for bit#11 and bit#12 of the mstatus CSR.

You will need to set these bits to 0b11 if the next process scheduled is a kernel process (i.e., pid<GPID_USER_START) and set them to 0b00 for all the other processes. In RISC-V, 0b00 stands for user mode and 0b11 stands for machine mode. To see why it works, let us review what happens when entering and exiting the kernel.

Upon an interrupt or exception, the CPU enters the kernel and it automatically switches the privilege mode to machine mode right before jumping to the handler function trap_entry. This allows the kernel to run in the machine mode and thus access the memory freely.

Upon executing mret in grass/kernel.s, the CPU exits the kernel and mret will switch the privilege mode according to mstatus.MPP. Therefore, if we set mstatus.MPP to 0b00 in proc_yield, the application code will run in the user mode after executing mret.
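The bit manipulation on mstatus.MPP can be sketched as a pure function. This is an illustration under our own assumptions: the names `MSTATUS_MPP_MASK`, `PRIV_USER`, `PRIV_MACHINE`, `mstatus_set_mpp` and `next_mode` are hypothetical, and `user_start` is a parameter standing in for GPID_USER_START. Reading and writing the actual CSR is left out.

```c
#include <stdint.h>

#define MSTATUS_MPP_SHIFT 11
#define MSTATUS_MPP_MASK  (3u << MSTATUS_MPP_SHIFT)  /* bit#11 and bit#12 */
#define PRIV_USER    0u  /* 0b00 */
#define PRIV_MACHINE 3u  /* 0b11 */

/* Return mstatus with MPP replaced by mode and all other bits unchanged. */
uint32_t mstatus_set_mpp(uint32_t mstatus, uint32_t mode) {
    return (mstatus & ~MSTATUS_MPP_MASK)
         | ((mode << MSTATUS_MPP_SHIFT) & MSTATUS_MPP_MASK);
}

/* Pick the privilege mode that mret should enter for the next process. */
uint32_t next_mode(int pid, int user_start) {
    return pid < user_start ? PRIV_MACHINE : PRIV_USER;
}
```

Clearing the field before setting it matters: OR-ing in 0b00 alone would leave a previous 0b11 in place, and the user application would keep running in the machine mode.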

Kill malicious applications

To test if you correctly set the PMP region and switch privilege modes, we have provided 2 malicious applications crash1 and crash2 under apps/user. The malicious applications would halt the whole operating system by corrupting the memory.

shell
# Make sure to choose software TLB
>  make qemu
...
[CRITICAL] Choose a memory translation mechanism:
Enter 0: page tables
Enter 1: software TLB
[INFO] Software translation is chosen
...
[CRITICAL] Welcome to the egos-2000 shell!
➜ /home/yunhao crash1
_sbrk: heap grows too large
[FATAL] excp_entry: kernel got exception 7

Note that the FATAL happens at the end of function excp_entry. The final coding task in this project is to implement the following part of excp_entry.

c
static void excp_entry(uint id) {
    ...
    /* Student's code goes here (system call and memory exception). */

    /* Kill the process if curr_pid is a user application */

    /* Student's code ends here. */
    FATAL("excp_entry: kernel got exception %d", id);
}

After excp_entry kills the malicious applications gracefully, you should see the following.

shell
# Make sure to choose software TLB
>  make qemu
...
➜ /home/yunhao crash1
_sbrk: heap grows too large
[INFO] process 6 terminated with exception 7
➜ /home/yunhao crash2
[INFO] process 7 terminated with exception 7
➜ /home/yunhao

In other words, once memory protection works, the malicious applications run in the user mode and trigger memory exceptions when they try to corrupt the memory, and the kernel kills them when handling these exceptions.

Accomplishments

In terms of OS concepts, you have learned about exception handling, system calls, privilege modes and inter-process communication. In terms of code, you have read everything under grass and library/syscall. The grass directory is the "kernel" part of egos-2000, i.e., the core logic of an operating system.

You will read earth/cpu_mmu.c in P4, earth/dev_tty.c and earth/dev_disk.c in P5, and everything under library/file in P6. By then, you will have read essentially all the code of egos-2000. We are halfway there!

"... any person ... any study."