Handle signals properly #529

ChinYikMing · 2024-12-26T21:14:50Z

flowchart TB
    X[page faults] --> trap --> do_page_fault --> lock_mm_and_find_vma --> find_vma
    find_vma -- valid VMA --> A[Map the VMA to page table and resume the userspace]
    find_vma -- NULL --> bad_area_nosemaphore
    bad_area_nosemaphore -- from kernel space --> die_kernel_fault
    bad_area_nosemaphore -- from user space --> do_trap --> B[Checks whether the signal should be ignored or blocked, and if not, adds it to the task's pending signal list.]
                         
    ret_from_exception --> resume_userspace_slow --> do_work_pending -- determined by _TIF_WORK_MASK --> do_signal --> handle_signal --> setup_rt_frame --> C[Update SEPC CSR if needed] --> resume_userspace

The above flowchart is the simplified version of handling page faults.

Page faults trigger a trap, which is handled by do_page_fault(). This function calls lock_mm_and_find_vma() to locate and validate the virtual memory area (VMA), returning the VMA if valid, or NULL otherwise. Typically, attempts to read or write to a NULL VMA result in a NULL return. If the VMA is invalid, bad_area_nosemaphore() is invoked, which checks whether the fault originated in kernel or user space.

For user-space faults, a SIGSEGV signal is sent to the user process via do_trap(), which determines if the signal should be ignored or blocked, and if not, adds it to the task's pending signal list. Kernel-space faults cause the kernel to crash via die_kernel_fault().

Before returning to user space (via the resume_userspace label), pending work (indicated by the _TIF_WORK_MASK mask) is processed by do_work_pending(). Signals are handled by do_signal(), which in turn calls handle_signal(). handle_signal() creates a signal handler frame that will be jumped to upon returning to user space. This frame creation process might modifies the Control and Status Register (CSR) SEPC. If there are a signal pending, the SEPC CSR overwritten the original trap/fault PC. This caused an assertion failure in get_ppn_and_offset() when running the vi program, reported in [1].

To address this, a variable last_csr_sepc was introduced to store the original SEPC CSR value before entering the trap path. After returning to user space, last_csr_sepc is compared with the current SEPC CSR value. If they differ, the fault ld/st instruction returns early and jumps to the signal handler frame.

This commit prevents emulator crashes when the guest OS accesses invalid memory. Consequently, reads or writes to a NULL value now correctly result in a segmentation fault. In addition, two user-space programs: mem_null_read and mem_null_write are bundled into the rootfs for verification.

mem_null_read and mem_null_write source codes:

mem_null_read

#include <stdio.h>

int main(){
	char *ptr = NULL;
	printf("*ptr: %c\n", *ptr);
	return 0;
}

mem_null_write

#include <stdio.h>

int main(){
        char *ptr = NULL;
        *ptr = 'a';

        return 0;
}

Original behaviour

Run system emulation

$ make system ENABLE_SYSTEM=1 -j$(nproc)

Emulator crashes

$ mem_null_read

Emulator crashes

$ mem_null_write

Emulator crashes

$ vi

Patch Reproduce / Testing procedure:

Run system emulation

$ make system ENABLE_SYSTEM=1 -j$(nproc)

NULL read causes SIGSEGV without crashing emulator

$ mem_null_read

NULL write causes SIGSEGV without crashing emulator

$ mem_null_write

w/o filename causes SIGSEGV without crashing emulator

$ vi

[1] #508

vi Crach Discussion

Running vi without a filename results in a NULL pointer access (behavior that is implementation-defined), correctly triggering a segmentation fault. Dmesg confirms that the faulting address is indeed NULL. However, running vi without a filename in RVVM and semu does not produce a SIGSEGV fault. My investigation revealed that semu simply increments the program counter (PC) by 4, effectively skipping the faulting instruction. I'm unsure if this is a valid approach. What I know is that read/write access to NULL should generate a non-maskable SIGSEGV signal, consistent with standard Unix signal.

Dmesg after running vi (getting SIGSEGV instead of crashing w/ the patch):

[   10.374478] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[   10.374812] Oops [#2]
[   10.374962] Modules linked in:
[   10.375152] CPU: 0 PID: 61 Comm: vi Tainted: G      D            6.1.119 #1
[   10.375504] Hardware name: rv32emu (DT)
[   10.375696] epc : strncpy_from_user+0x6c/0x190
[   10.375986]  ra : getname_flags+0x74/0x194
[   10.376269] epc : c01bd3a4 ra : c00ba71c sp : c0989ea0
[   10.376553]  gp : c0476320 tp : c0aa0580 t0 : 00000ff0
[   10.376831]  t1 : fefefeff t2 : 691a2420 s0 : c0989eb0
[   10.377115]  s1 : c0882000 a0 : 00000000 a1 : 00000000
[   10.377377]  a2 : 00000ff0 a3 : 00000000 a4 : 00000000
[   10.377632]  a5 : 00000ff0 a6 : 0000146d a7 : c0882010
[   10.377904]  s2 : c0989f38 s3 : 00000000 s4 : 00000000
[   10.378167]  s5 : c047752c s6 : 00000000 s7 : 00000000
[   10.378429]  s8 : 00001000 s9 : 00000002 s10: 00000014
[   10.378688]  s11: ffffffff t3 : 80808080 t4 : 00040000
[   10.378968]  t5 : 00000005 t6 : 00000ff0
[   10.379174] status: 00040120 badaddr: 00000000 cause: 0000000d
[   10.379455] [<c01bd3a4>] strncpy_from_user+0x6c/0x190
[   10.379790] [<c00ba71c>] getname_flags+0x74/0x194
[   10.380117] [<c00ba88c>] getname+0x1c/0x2c
[   10.380421] [<c00a9f7c>] do_sys_openat2+0x4c/0xf0
[   10.380776] [<c00aa11c>] do_sys_open+0x40/0x58
[   10.381122] [<c00aa194>] sys_openat+0x24/0x34
[   10.381467] [<c00023c0>] ret_from_syscall+0x0/0x4
[   10.381788] ---[ end trace 0000000000000000 ]---
Segmentation fault
#

jserv

Benchmarks

Benchmark suite	Current: `8e08387`	Previous: `f554a00`	Ratio
`Dhrystone`	`1334` Average DMIPS over 10 runs	`1287` Average DMIPS over 10 runs	`0.96`
`Coremark`	`970.171` Average iterations/sec over 10 runs	`962.761` Average iterations/sec over 10 runs	`0.99`

This comment was automatically generated by workflow using github-action-benchmark.

src/emulate.c

src/system.c

RinHizakura · 2024-12-27T16:09:55Z

src/system.c

+        }                                                           \
+    } while (0)
+
+#define CLEAR_PENDING_SIGNAL_W()                                  \


Do we just introduce these CLEAR_PENDING_SIGNAL_* macros for repeated code? I think the readability is bad to return in a macro that is used as a function-like call routine.

Alternatively, I think we can remain return in CLEAR_PENDING_SIGNAL_* and mention they might early return using a comment at the call point of CLEAR_PENDING_SIGNAL_*?

Is there any disadvantage to removing these macros and just putting them inline? It can be just a few lines.

if (need_handle_signal = (rv->csr_sepc != rv->last_csr_sepc)) return 0; ... if (need_handle_signal = (rv->csr_sepc != rv->last_csr_sepc)) return;

Is there any disadvantage to removing these macros and just putting them inline? It can be just a few lines.

if (need_handle_signal = (rv->csr_sepc != rv->last_csr_sepc)) return 0; ... if (need_handle_signal = (rv->csr_sepc != rv->last_csr_sepc)) return;

Looking these few LoC is hard to understand (maybe for newcomer), thus abstracting them into a macro and comment around the macro make things clearer.

How about inline this directly?

need_handle_signal = (rv->csr_sepc != rv->last_csr_sepc); if (need_handle_signal) return 0;

You can replace need_handle_signal = (rv->csr_sepc != rv->last_csr_sepc) with an abstract if you think it is needed. I am just thinking that we should avoid return in a function-like macro as possible because it is not clear that the function will return there at first glance.

vacantron · 2024-12-27T17:34:40Z

Should we stack the exception handling (i.e. nested exception handling)? By default, RISC-V does not accept any incoming interrupt in the interrupt service routine (ISR) by disabling m/sie, however, there is no same mechanism when handling the exceptions.

ChinYikMing · 2024-12-28T03:19:31Z

Should we stack the exception handling (i.e. nested exception handling)?

Do you mean nested signals or others since the changes in this commit is signals?

By default, RISC-V does not accept any incoming interrupt in the interrupt service routine (ISR) by disabling m/sie

Yes.

however, there is no same mechanism when handling the exceptions.

If you mean regular exceptions, we did store current sie to spie then turning of sie when trap. Check here.

vacantron · 2024-12-28T16:23:08Z

Do you mean nested signals or others since the changes in this commit is signals?

Yes. There might be more than two-level nested signals. The oldest m/sepc might be forgotten if we only store two of them.

ChinYikMing · 2024-12-29T09:08:24Z

Do you mean nested signals or others since the changes in this commit is signals?

Yes. There might be more than two-level nested signals. The oldest m/sepc might be forgotten if we only store two of them.

The signal is a userspace stack frame (refer to the commit message and the flowchart) that is created by the kernel. Thus, when multiple signals arrive, the kernel simply creates multiple signal frames and redirects the PC to handle the signal at a safe point. The return address of old signal frame will be saved in the struct sigcontext and restored via sys_sigreturn or sys_rt_sigreturn.

jserv · 2024-12-30T04:16:15Z

src/system.c

+ * cannot always redo. For example, invalid memory access causes SIGSEGV.
+ */
+extern bool need_handle_signal;
+#define CHECK_PENDING_SIGNAL()                                    \


Specify rv parameter in this function-like macro. That is,

#define CHECK_PENDING_SIGNAL(rv) \ ...

Instead of using rv, it may be more appropriate to use need_handle_signal as the parameter of the macro. This is because rv has only a single instance, and changes to its members are part of the macro's internal logic.

On the other hand, need_handle_signal is explicitly modified by the macro and directly determines whether the function shall return. This approach improves intuitiveness and readability. Consider the following revised version:

#define CHECK_PENDING_SIGNAL(flag) \ do { \ flag = (rv->csr_sepc != rv->last_csr_sepc); \ } while (0);

Page faults trigger a trap, which is handled by do_page_fault(). This function calls lock_mm_and_find_vma() to locate and validate the virtual memory area (VMA), returning the VMA if valid, or NULL otherwise. Typically, attempts to read or write to a NULL VMA result in a NULL return. If the VMA is invalid, bad_area_nosemaphore() is invoked, which checks whether the fault originated in kernel or user space. For user-space faults, a SIGSEGV signal is sent to the user process via do_trap(), which determines if the signal should be ignored or blocked, and if not, adds it to the task's pending signal list. Kernel-space faults cause the kernel to crash via die_kernel_fault(). Before returning to user space (via the resume_userspace label), pending work (indicated by the _TIF_WORK_MASK mask) is processed by do_work_pending(). Signals are handled by do_signal(), which in turn calls handle_signal(). handle_signal() creates a signal handler frame that will be jumped to upon returning to user space. This frame creation process might modifies the Control and Status Register (CSR) SEPC. If there are a signal pending, the SEPC CSR overwritten the original trap/fault PC. This caused an assertion failure in get_ppn_and_offset() when running the vi program, reported in [1]. To address this, a variable last_csr_sepc was introduced to store the original SEPC CSR value before entering the trap path. After returning to user space, last_csr_sepc is compared with the current SEPC CSR value. If they differ, the fault ld/st instruction returns early and jumps to the signal handler frame. This commit prevents emulator crashes when the guest OS accesses invalid memory. Consequently, reads or writes to a NULL value now correctly result in a segmentation fault. In addition, two user-space programs: mem_null_read and mem_null_write are bundled into the rootfs for verification. Original behaviour: 1. $ make system ENABLE_SYSTEM=1 -j$(nproc) # run system emulation 2. $ mem_null_read # Emulator crashes 3. $ mem_null_write # Emulator crashes 4. $ vi # Emulator crashes Patch Reproduce / Testing procedure: 1. $ make system ENABLE_SYSTEM=1 -j$(nproc) # run system emulation 2. $ mem_null_read # NULL read causes SIGSEGV with no crashing 3. $ mem_null_write # NULL write causes SIGSEGV with no crashing 4. $ vi # w/o filename causes SIGSEGV with no crashing [1] sysprog21#508

jserv · 2024-12-30T08:25:06Z

Thank @ChinYikMing for contributing!

jserv reviewed Dec 26, 2024

View reviewed changes

ChinYikMing force-pushed the handle-signal branch from 93091f5 to b952f4b Compare December 26, 2024 21:23

jserv requested review from RinHizakura and vacantron December 26, 2024 22:40

jserv reviewed Dec 26, 2024

View reviewed changes

src/emulate.c Show resolved Hide resolved

ChinYikMing force-pushed the handle-signal branch 2 times, most recently from 6a5733b to b4c467f Compare December 27, 2024 03:39

jserv reviewed Dec 27, 2024

View reviewed changes

src/system.c Outdated Show resolved Hide resolved

ChinYikMing force-pushed the handle-signal branch 3 times, most recently from c9510ce to 6542e92 Compare December 27, 2024 04:10

jserv added this to the release-2025.1 milestone Dec 27, 2024

RinHizakura reviewed Dec 27, 2024

View reviewed changes

ChinYikMing force-pushed the handle-signal branch 3 times, most recently from 8ee7304 to cb5eb10 Compare December 30, 2024 02:10

jserv reviewed Dec 30, 2024

View reviewed changes

ChinYikMing force-pushed the handle-signal branch from cb5eb10 to 8e08387 Compare December 30, 2024 08:03

jserv merged commit 5a04f3a into sysprog21:master Dec 30, 2024
8 checks passed

ChinYikMing deleted the handle-signal branch December 30, 2024 09:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Handle signals properly #529

Handle signals properly #529

ChinYikMing commented Dec 26, 2024 •

edited

Loading

jserv left a comment •

edited

Loading

RinHizakura Dec 27, 2024 •

edited

Loading

ChinYikMing Dec 28, 2024

RinHizakura Dec 28, 2024 •

edited

Loading

ChinYikMing Dec 29, 2024

RinHizakura Dec 29, 2024 •

edited

Loading

vacantron commented Dec 27, 2024

ChinYikMing commented Dec 28, 2024 •

edited

Loading

vacantron commented Dec 28, 2024

ChinYikMing commented Dec 29, 2024

jserv Dec 30, 2024

otteryc Dec 30, 2024

jserv commented Dec 30, 2024

Handle signals properly #529

Handle signals properly #529

Conversation

ChinYikMing commented Dec 26, 2024 • edited Loading

mem_null_read and mem_null_write source codes:

Original behaviour

Patch Reproduce / Testing procedure:

vi Crach Discussion

jserv left a comment • edited Loading

Choose a reason for hiding this comment

Benchmarks

RinHizakura Dec 27, 2024 • edited Loading

Choose a reason for hiding this comment

ChinYikMing Dec 28, 2024

Choose a reason for hiding this comment

RinHizakura Dec 28, 2024 • edited Loading

Choose a reason for hiding this comment

ChinYikMing Dec 29, 2024

Choose a reason for hiding this comment

RinHizakura Dec 29, 2024 • edited Loading

Choose a reason for hiding this comment

vacantron commented Dec 27, 2024

ChinYikMing commented Dec 28, 2024 • edited Loading

vacantron commented Dec 28, 2024

ChinYikMing commented Dec 29, 2024

jserv Dec 30, 2024

Choose a reason for hiding this comment

otteryc Dec 30, 2024

Choose a reason for hiding this comment

jserv commented Dec 30, 2024

ChinYikMing commented Dec 26, 2024 •

edited

Loading

jserv left a comment •

edited

Loading

RinHizakura Dec 27, 2024 •

edited

Loading

RinHizakura Dec 28, 2024 •

edited

Loading

RinHizakura Dec 29, 2024 •

edited

Loading

ChinYikMing commented Dec 28, 2024 •

edited

Loading