Skip to content

Hello, World! ​

This project helps you understand the following C program.

c
int main() {
    printf("Hello, World!");
    return 0;
}

Specifically, you will learn about 3 types of CPU instructions (integer computation, control transfer, memory access) and 4 types of memory regions (code, data, heap, stack), which are fundamental to every computer program.

Prerequisites: an entry-level systems course equivalent to Cornell CS3410 or CMU 15-213 is typically required before learning OS. This project aims at revisiting the concepts of CPU instructions and memory regions instead of taking the place of an entry-level course.

Get started ​

Follow the instructions of step1 and step2 in USAGES.md. After you can compile egos-2000 and run it on QEMU, switch to the hello branch by git checkout hello. Run this branch by make qemu and then you will see

-------- Compile Hello, World! --------
riscv-none-elf-gcc -march=rv32ima_zicsr -mabi=ilp32 -Wl,--gc-sections -ffunction-sections -fdata-sections -fdiagnostics-show-option hello.s hello.c -Thello.lds -nostdlib -lc -lgcc -o hello.elf
riscv-none-elf-objdump --source --all-headers --demangle --line-numbers --wide hello.elf > hello.lst
riscv-none-elf-objcopy -O binary hello.elf hello.bin
-------- Run Hello-World on QEMU --------
qemu-system-riscv32 -nographic -machine virt -smp 1 -bios hello.bin
Hello, World!

The code in this branch simply prints Hello, World! to the screen and then stalls. You can exit QEMU by typing ctrl+a and then x, after which you will see QEMU: Terminated.

Next, we explain the code in this hello branch which gives concrete examples of CPU instructions (integer computation, control transfer, memory access) and memory regions (code, data, heap, stack). After the explanation, you will start from this code and implement the printf function as an exercise.

Instructions and memory ​

Before main() ​

Let's start with the first two lines of hello.lds.

lds
OUTPUT_ARCH("riscv")
ENTRY(_start)

The first line says that the output architecture for this computer program is RISC-V. This book will focus on RISC-V since it is widely used for education, but there also exist other influential CPU architectures such as ARM and Intel x86. The second line says that the entry point of this program is _start, defined in hello.s. We will move our attention to hello.s and ignore the rest of hello.lds for now.

According to _start in hello.s, the first two CPU instructions in this program are

asm
li sp, 0x80400000
call main

The li instruction loads the 4-byte integer 0x80400000 to the stack pointer register sp. li is an example of integer computation instructions because it modifies a register with an integer. Other examples of integer computation include addition, multiplication, division, bitwise shift, bitwise AND/OR/XOR operations, etc.

After setting sp, the call instruction transfers the control to main() in hello.c. The call instruction is an example of control transfer instructions, and it does two things: (1) point the program counter to the first instruction of main(); (2) set the ra register to the address of call _end in hello.s, which is the instruction next to call main. Intuitively, sp specifies the memory that main() can use on the stack and ra specifies the return address, i.e., the program counter after main() returns. We make this intuition concrete by reading the code of main().

In main() ​

The C code of main() in hello.c is simple:

c
int main() {
    char* msg = "Hello, World!\n\r";
    terminal_write(msg, 15);

    return 0;
}

After make, you can find the CPU instructions for main() in hello.lst:

lst
80000078 <main>:
main():
80000078:   fe010113            addi    sp,sp,-32
8000007c:   00112e23            sw  ra,28(sp)
80000080:   00812c23            sw  s0,24(sp)
80000084:   02010413            addi    s0,sp,32
80000088:   800007b7            lui a5,0x80000
8000008c:   0b878793            addi    a5,a5,184 # 800000b8 <main+0x40>
80000090:   fef42623            sw  a5,-20(s0)
80000094:   00f00593            li  a1,15
80000098:   fec42503            lw  a0,-20(s0)
8000009c:   f75ff0ef            jal 80000010 <terminal_write>
800000a0:   00000793            li  a5,0
800000a4:   00078513            mv  a0,a5
800000a8:   01c12083            lw  ra,28(sp)
800000ac:   01812403            lw  s0,24(sp)
800000b0:   02010113            addi    sp,sp,32
800000b4:   00008067            ret

There are 16 instructions for main starting at memory address 0x80000078 and each of them occupies 4 bytes of memory. Therefore, the call main instruction in hello.s sets the program counter to 0x80000078 which points to an addi instruction. The binary form of this addi instruction is 0xfe010113 and such binary encoding is explained in the RISC-V Instruction Set Manual. For our purpose, we can ignore the binary encodings and focus on the instructions themselves. Let's start by reading the 7 highlighted instructions.

Common structure ​

The two addi instructions at address 0x80000078 and 0x800000b0 add integers -32 and 32 to the sp register, which indicates that the 32-byte memory region [0x80400000 - 32, 0x80400000) is used for main(). The ret at the end will cause a control transfer to the instruction pointed by ra, i.e., the call _end in hello.s as we explained earlier. Such usage of sp and ra is the common structure of every function in C.

Note that ra should still hold the address of call _end when executing ret, which is ensured by these two instructions:

lst
8000007c:   00112e23            sw  ra,28(sp)
...
800000a8:   01c12083            lw  ra,28(sp)

28(sp) means adding 28 to sp as a memory address. Since sp equals to 0x80400000 - 32, the sw instruction stores the 4 bytes of ra into memory region [0x80400000 - 4, 0x80400000) and the lw loads 4 bytes from the region back to ra. As long as this 4-byte region is unmodified, register ra can be freely modified during the execution of main().

These are examples of memory access instructions and another example is lb/sb which loads/stores one byte from/to the memory while lw/sw loads/stores a word of 4 bytes.

TIP

By this point, we have seen examples of all the 3 types of CPU instructions (integer computation, control transfer, memory access) which constitute the user-level instructions of RISC-V. P0 and P1 will only use user-level instructions. Starting from P2, you will see more instructions and registers that support an operating system.

First line of main() ​

c
    char* msg = "Hello, World!\n\r";

is compiled to the following instructions.

lst
80000084:   02010413            addi    s0,sp,32
80000088:   800007b7            lui a5,0x80000
8000008c:   0b878793            addi    a5,a5,184 # 800000b8 <main+0x40>
80000090:   fef42623            sw  a5,-20(s0)

lui stands for load-upper-immediate which means that the upper 20 bits of a5 are set to 0x80000 while the lower 12 bits are set to 0. Therefore, these 4 instructions (1) set s0 to 0x80400000; (2) set a5 to 0x800000b8; (3) store the 4-byte of 0x800000c8 at memory address 0x80400000 - 20. This means two things:

  1. The compiler puts the "Hello, World!\n\r" string at memory address 0x800000b8.

  2. The compiler decides that msg, as a local variable of main(), occupies the 4 bytes of memory starting at 0x80400000 - 20 in the 32-byte main() stack.

TIP

This is a good point to revise the concept of pointers. In C, a char is simply one byte storing the ASCII encoding of a character. For example, the ASCII encoding for "H" and "e" are 72 and 101 respectively so that memory address 0x800000b8 holds 72 and 0x800000b9 holds 101. The type of msg, namely char*, is called char pointer, meaning that variable msg holds the memory address of a char. This book targets 32-bit CPUs so that every memory address is represented by 4 bytes (i.e., all pointers hold 4 bytes). One can dereference the msg pointer with syntax *msg, e.g., *msg == 'H' and *(msg + 1) == 'e'.

Second and third line ​

c
    terminal_write(msg, 15);
    return 0;

are compiled to the following instructions.

lst
80000094:   00f00593            li  a1,15
80000098:   fec42503            lw  a0,-20(s0)
8000009c:   f75ff0ef            jal 80000010 <terminal_write>
800000a0:   00000793            li  a5,0
800000a4:   00078513            mv  a0,a5

In short, registers a0 and a1 hold the first two arguments when calling a function, and a0 further holds the return value. Therefore, a1 is set to 15 and a0 is set to 0x800000b8 by loading 4 bytes from address 0x80400000 - 20. jal is an alias of call and the first instruction of terminal_write is at address 0x80000010. After terminal_write returns, a0 is set to 0 as the return value of main(). It is possible to do li a0,0 directly without touching a5 if the compiler did more optimizations.

In terminal_write() ​

For terminal_write(), we only explain the C code and you could read all its instructions in hello.lst as an exercise.

c
void terminal_write(const char *str, int len) {
    for (int i = 0; i < len; i++) {
        *(char*)(0x10000000UL) = str[i];
    }
}

At this point, it is enough to know that writing a byte to the special address 0x10000000 will print a character on the screen. P5 will introduce the architectural support for device I/O.

*(char*)(0x10000000UL) converts 0x10000000 into a char pointer and then dereferences the pointer, so it stands for the byte at memory address 0x10010000.

Code, data and stack ​

We have examined the two functions in this program. We now review the memory regions touched by the CPU instructions. Let's start with the following part of hello.lst:

lst
Sections:
Idx Name              Size      VMA       LMA       File off  Algn  Flags
  0 .init             0000000c  80000000  80000000  00001000  2**3  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .text             000000a8  80000010  80000010  00001010  2**3  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .rodata           00000010  800000b8  800000b8  000010b8  2**3  CONTENTS, ALLOC, LOAD, READONLY, DATA

A section is simply a memory region specified by a starting address and its size.

The .init section starts at 0x80000000 and holds the three instructions in hello.s. By default, 0x80000000 is used to initialize the program counter in QEMU. The .text section holds instructions for terminal_write() and main() starting at 0x80000010 and ending at 0x800000b8. Together, .init and .text form the code region of this program.

The 16 bytes starting at 0x800000b8 form the data region of this program. The data region would typically contain 3 sections:

  1. .rodata holds read-only data such as the "Hello, World!\n\r" string.
  2. .data holds global variables with non-zero initial values.
  3. .bss holds global variables with zero as their initial value.

This simple program does not have any global variables, so we only see .rodata. The end of .rodata is 0x800000c8, which is the start of the heap region. You will learn about heap later in the dynamic memory allocation section.

Lastly, the stack region starts at 0x80400000 and grows down. Specifically, the stack for main is [0x80400000 - 32, 0x80400000) and the stack for terminal_write is the 48-byte region [0x80400000 - 32 - 48, 0x80400000 - 32). The choice of 0x80400000 is not mandatory and it is possible to set sp with a different value in hello.s.

Formatted output ​

The terminal_write function takes a string and prints it on the screen, but we may further wish to print integers, pointers in hexadecimal format, etc. We thus introduce the printf function which converts a format into a string and prints it on the screen.

c
printf("%d is a number", 1234);
printf("%s is a string", "abcd");
printf("%s-%d is awesome!", "egos", 2000);

These examples show that printf takes a string as the first argument which has special meanings: %d and %s mean placeholders for integers and strings. For example, the last printf would print "egos-2000 is awesome!" according to the 3 arguments.

Uncomment the printf from the main function of hello.c, and uncomment functions format_to_str and printf. Do make qemu again and you will see:

Hello, World!
egos-2000 is awesome!

Let's start by reading the first 3 lines of this code block:

c
#include <string.h>  // for strlen() and strcat()
#include <stdlib.h>  // for itoa()
#include <stdarg.h>  // for va_start(), va_end() and va_arg()

Intuitively, the code is including part of the C standard library. For example, strlen takes a string and returns its length, and the code of strlen is provided by the string.h library in the C compiler, so you don't need to write it yourself. Similarly, strcat in the string.h library concatenates two strings into a single one. And the itoa in stdlib.h converts an integer to a string in ASCII encoding (i.e., integer-to-ASCII).

Lastly, stdarg.h provides va_start, va_end and va_arg, where va means variable argument (i.e., the argument list of printf does not have a fixed length so that different invocations of printf can take different numbers of arguments). In format_to_str, the va_arg(args, int) reads an int-type argument from the list and the itoa converts the integer to a string, appending to the end of the out string.

TIP

Please do read the assembly code of format_to_str and printf in hello.lst. You can skip library functions like strlen or strcat, and focus on instructions within these two functions. You will get a precise understanding by connecting the C code with the CPU instructions, which verbal explanations cannot give to you. For example, you will see how va_arg is compiled to a few load and store instructions without doing any function calls.

After gaining a precise understanding of format_to_str and printf, you need to modify format_to_str so that the code below works correctly.

c
printf("%c is character $", '$');
printf("%c is character 0", (char)48);
printf("%x is integer 1234 in hexadecimal", 1234);
printf("%u is the maximum of unsigned int", (unsigned int)0xFFFFFFFF);
printf("%p is the hexadecimal address of the hello-world string", msg);
printf("%lu is the maximum of unsigned long long", 0xFFFFFFFFFFFFFFFFUL);

Dynamic memory allocation ​

The char buf[512] in printf allocates 512 bytes on the stack of printf to hold the output string generated by format_to_str. However, the output string can be longer than 512 bytes and, during compile time, the length of the output string is usually unknown. This leads to the need of dynamic memory allocation:

c
int printf(const char* format, ...) {
    /* TODO: calculate the length of the output string as buf_len */
    char *buf = malloc(buf_len);
    ...
    terminal_write(buf, strlen(buf));
    free(buf);
    /* the memory region returned by malloc() is no longer needed */
    return 0;
}

In this last part of P0, you need to implement the TODO above which initializes an integer variable buf_len as the argument of malloc, a function provided by stdlib.h. After malloc, the buf pointer holds an address such that the buf_len bytes starting at this address can be freely used. After invoking free, this region cannot be used any more.

Heap ​

Before explaining more about malloc and free, we explain the heap memory region with the code below which you need to uncomment from hello.c.

c
extern char __heap_start, __heap_max;
static char* brk = &__heap_start;
char* _sbrk(int size) {
    if (brk + size > (char*)&__heap_max) {
        terminal_write("_sbrk: heap grows too large\r\n", 29);
        return NULL;
    }

    char* old_brk = brk;
    brk += size;
    return old_brk;
}

__heap_start and __heap_max are defined in hello.lds: __heap_start is the end of the data region (i.e., the end of the .bss section) while __heap_max is hard-coded to 0x80200000, just like the initial stack pointer is hard-coded to 0x80400000.

The purpose here is to ensure that the heap and stack do not overlap when the heap grows up and the stack grows down, which spawns the concept of break pointer (i.e., the brk variable above pointing to the end of the heap). As long as the break pointer is lower than the stack pointer, the stack and the heap do not overlap. We simply assume that the stack pointer sp is never lower than 0x80200000 (i.e., stack size is never larger than 2MB) and the brk variable is never greater than 0x80200000.

Malloc and free ​

The initial heap size is 0 since brk is initialized as __heap_start and the heap region is defined as [brk, __heap_start). Say malloc(512) is the first time of invoking malloc. Three things will happen within malloc:

  1. Invoke _sbrk with an argument greater than 512, say 4096 (i.e., 4KB);
  2. With the return value of _sbrk, initialize some data structures in the 4KB region;
  3. With the data structures, find 512 bytes in this heap region and return its address.

Say malloc(32) is invoked next. It is likely that malloc can find 32 bytes using the data structures without having to invoke _sbrk again and increase the heap. Upon free, the data structures are updated so that the freed memory could be reused by a later malloc. For example, multiple invocations of printf may use the same memory region to hold the output string (i.e., malloc returns the same pointer multiple times).

The data structure operations for malloc and free are implemented in the stdlib.h library from the C compiler so that you don't have to know the details, just like you don't have to know the details of itoa or strcat. However, you are more than welcome to design your own data structures and write your own malloc and free functions.

Accomplishments ​

You have finished reading library/libc in egos-2000. You also get the intuition behind library/elf which initializes the 4 memory regions for a program, given the executable file of this program in the ELF format.

"... any person ... any study."