-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle signals properly #529
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Benchmarks
Benchmark suite | Current: 8e08387 | Previous: f554a00 | Ratio |
---|---|---|---|
Dhrystone |
1334 Average DMIPS over 10 runs |
1287 Average DMIPS over 10 runs |
0.96 |
Coremark |
970.171 Average iterations/sec over 10 runs |
962.761 Average iterations/sec over 10 runs |
0.99 |
This comment was automatically generated by workflow using github-action-benchmark.
93091f5
to
b952f4b
Compare
6a5733b
to
b4c467f
Compare
c9510ce
to
6542e92
Compare
src/system.c
Outdated
} \ | ||
} while (0) | ||
|
||
#define CLEAR_PENDING_SIGNAL_W() \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we just introduce these CLEAR_PENDING_SIGNAL_*
macros for repeated code? I think the readability is bad to return
in a macro that is used as a function-like call routine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, I think we can remain return
in CLEAR_PENDING_SIGNAL_*
and mention they might early return using a comment at the call point of CLEAR_PENDING_SIGNAL_*
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any disadvantage to removing these macros and just putting them inline? It can be just a few lines.
if (need_handle_signal = (rv->csr_sepc != rv->last_csr_sepc))
return 0;
...
if (need_handle_signal = (rv->csr_sepc != rv->last_csr_sepc))
return;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any disadvantage to removing these macros and just putting them inline? It can be just a few lines.
if (need_handle_signal = (rv->csr_sepc != rv->last_csr_sepc)) return 0; ... if (need_handle_signal = (rv->csr_sepc != rv->last_csr_sepc)) return;
Looking these few LoC is hard to understand (maybe for newcomer), thus abstracting them into a macro and comment around the macro make things clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about inline this directly?
need_handle_signal = (rv->csr_sepc != rv->last_csr_sepc);
if (need_handle_signal)
return 0;
You can replace need_handle_signal = (rv->csr_sepc != rv->last_csr_sepc)
with an abstract if you think it is needed. I am just thinking that we should avoid return
in a function-like macro as possible because it is not clear that the function will return there at first glance.
Should we stack the exception handling (i.e. nested exception handling)? By default, RISC-V does not accept any incoming interrupt in the interrupt service routine (ISR) by disabling |
Do you mean nested signals or others since the changes in this commit is signals?
Yes.
If you mean regular exceptions, we did store current |
Yes. There might be more than two-level nested signals. The oldest |
The signal is a userspace stack frame (refer to the commit message and the flowchart) that is created by the kernel. Thus, when multiple signals arrive, the kernel simply creates multiple signal frames and redirects the PC to handle the signal at a safe point. The return address of old signal frame will be saved in the |
8ee7304
to
cb5eb10
Compare
src/system.c
Outdated
* cannot always redo. For example, invalid memory access causes SIGSEGV. | ||
*/ | ||
extern bool need_handle_signal; | ||
#define CHECK_PENDING_SIGNAL() \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specify rv
parameter in this function-like macro. That is,
#define CHECK_PENDING_SIGNAL(rv) \
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using rv
, it may be more appropriate to use need_handle_signal
as the parameter of the macro. This is because rv
has only a single instance, and changes to its members are part of the macro's internal logic.
On the other hand, need_handle_signal
is explicitly modified by the macro and directly determines whether the function shall return. This approach improves intuitiveness and readability. Consider the following revised version:
#define CHECK_PENDING_SIGNAL(flag) \
do { \
flag = (rv->csr_sepc != rv->last_csr_sepc); \
} while (0);
Page faults trigger a trap, which is handled by do_page_fault(). This function calls lock_mm_and_find_vma() to locate and validate the virtual memory area (VMA), returning the VMA if valid, or NULL otherwise. Typically, attempts to read or write to a NULL VMA result in a NULL return. If the VMA is invalid, bad_area_nosemaphore() is invoked, which checks whether the fault originated in kernel or user space. For user-space faults, a SIGSEGV signal is sent to the user process via do_trap(), which determines if the signal should be ignored or blocked, and if not, adds it to the task's pending signal list. Kernel-space faults cause the kernel to crash via die_kernel_fault(). Before returning to user space (via the resume_userspace label), pending work (indicated by the _TIF_WORK_MASK mask) is processed by do_work_pending(). Signals are handled by do_signal(), which in turn calls handle_signal(). handle_signal() creates a signal handler frame that will be jumped to upon returning to user space. This frame creation process might modifies the Control and Status Register (CSR) SEPC. If there are a signal pending, the SEPC CSR overwritten the original trap/fault PC. This caused an assertion failure in get_ppn_and_offset() when running the vi program, reported in [1]. To address this, a variable last_csr_sepc was introduced to store the original SEPC CSR value before entering the trap path. After returning to user space, last_csr_sepc is compared with the current SEPC CSR value. If they differ, the fault ld/st instruction returns early and jumps to the signal handler frame. This commit prevents emulator crashes when the guest OS accesses invalid memory. Consequently, reads or writes to a NULL value now correctly result in a segmentation fault. In addition, two user-space programs: mem_null_read and mem_null_write are bundled into the rootfs for verification. Original behaviour: 1. $ make system ENABLE_SYSTEM=1 -j$(nproc) # run system emulation 2. $ mem_null_read # Emulator crashes 3. $ mem_null_write # Emulator crashes 4. $ vi # Emulator crashes Patch Reproduce / Testing procedure: 1. $ make system ENABLE_SYSTEM=1 -j$(nproc) # run system emulation 2. $ mem_null_read # NULL read causes SIGSEGV with no crashing 3. $ mem_null_write # NULL write causes SIGSEGV with no crashing 4. $ vi # w/o filename causes SIGSEGV with no crashing [1] sysprog21#508
cb5eb10
to
8e08387
Compare
Thank @ChinYikMing for contributing! |
The above flowchart is the simplified version of handling page faults.
Page faults trigger a trap, which is handled by
do_page_fault()
. This function callslock_mm_and_find_vma()
to locate and validate the virtual memory area (VMA), returning the VMA if valid, or NULL otherwise. Typically, attempts to read or write to a NULL VMA result in a NULL return. If the VMA is invalid,bad_area_nosemaphore()
is invoked, which checks whether the fault originated in kernel or user space.For user-space faults, a SIGSEGV signal is sent to the user process via
do_trap()
, which determines if the signal should be ignored or blocked, and if not, adds it to the task's pending signal list. Kernel-space faults cause the kernel to crash viadie_kernel_fault()
.Before returning to user space (via the resume_userspace label), pending work (indicated by the _TIF_WORK_MASK mask) is processed by
do_work_pending()
. Signals are handled bydo_signal()
, which in turn callshandle_signal()
.handle_signal()
creates a signal handler frame that will be jumped to upon returning to user space. This frame creation process might modifies the Control and Status Register (CSR) SEPC. If there are a signal pending, the SEPC CSR overwritten the original trap/fault PC. This caused an assertion failure inget_ppn_and_offset()
when running the vi program, reported in [1].To address this, a variable last_csr_sepc was introduced to store the original SEPC CSR value before entering the trap path. After returning to user space, last_csr_sepc is compared with the current SEPC CSR value. If they differ, the fault ld/st instruction returns early and jumps to the signal handler frame.
This commit prevents emulator crashes when the guest OS accesses invalid memory. Consequently, reads or writes to a NULL value now correctly result in a segmentation fault. In addition, two user-space programs: mem_null_read and mem_null_write are bundled into the rootfs for verification.
mem_null_read and mem_null_write source codes:
mem_null_read
mem_null_write
Original behaviour
Patch Reproduce / Testing procedure:
[1] #508
vi Crach Discussion
Running vi without a filename results in a NULL pointer access (behavior that is implementation-defined), correctly triggering a segmentation fault. Dmesg confirms that the faulting address is indeed NULL. However, running vi without a filename in RVVM and semu does not produce a SIGSEGV fault. My investigation revealed that semu simply increments the program counter (PC) by 4, effectively skipping the faulting instruction. I'm unsure if this is a valid approach. What I know is that read/write access to NULL should generate a non-maskable SIGSEGV signal, consistent with standard Unix signal.
Dmesg after running vi (getting SIGSEGV instead of crashing w/ the patch):
[ 10.374478] Unable to handle kernel NULL pointer dereference at virtual address 00000000 [ 10.374812] Oops [#2] [ 10.374962] Modules linked in: [ 10.375152] CPU: 0 PID: 61 Comm: vi Tainted: G D 6.1.119 #1 [ 10.375504] Hardware name: rv32emu (DT) [ 10.375696] epc : strncpy_from_user+0x6c/0x190 [ 10.375986] ra : getname_flags+0x74/0x194 [ 10.376269] epc : c01bd3a4 ra : c00ba71c sp : c0989ea0 [ 10.376553] gp : c0476320 tp : c0aa0580 t0 : 00000ff0 [ 10.376831] t1 : fefefeff t2 : 691a2420 s0 : c0989eb0 [ 10.377115] s1 : c0882000 a0 : 00000000 a1 : 00000000 [ 10.377377] a2 : 00000ff0 a3 : 00000000 a4 : 00000000 [ 10.377632] a5 : 00000ff0 a6 : 0000146d a7 : c0882010 [ 10.377904] s2 : c0989f38 s3 : 00000000 s4 : 00000000 [ 10.378167] s5 : c047752c s6 : 00000000 s7 : 00000000 [ 10.378429] s8 : 00001000 s9 : 00000002 s10: 00000014 [ 10.378688] s11: ffffffff t3 : 80808080 t4 : 00040000 [ 10.378968] t5 : 00000005 t6 : 00000ff0 [ 10.379174] status: 00040120 badaddr: 00000000 cause: 0000000d [ 10.379455] [<c01bd3a4>] strncpy_from_user+0x6c/0x190 [ 10.379790] [<c00ba71c>] getname_flags+0x74/0x194 [ 10.380117] [<c00ba88c>] getname+0x1c/0x2c [ 10.380421] [<c00a9f7c>] do_sys_openat2+0x4c/0xf0 [ 10.380776] [<c00aa11c>] do_sys_open+0x40/0x58 [ 10.381122] [<c00aa194>] sys_openat+0x24/0x34 [ 10.381467] [<c00023c0>] ret_from_syscall+0x0/0x4 [ 10.381788] ---[ end trace 0000000000000000 ]--- Segmentation fault #