Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spin_lock_irqsave+sched_lock #14578

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Conversation

hujun260
Copy link
Contributor

@hujun260 hujun260 commented Oct 31, 2024

Summary

1 Accelerated the implementation of sched_lock, remove enter_critical_section in sched_lock and
only enter_critical_section when task scheduling is required.
2 we add sched_lock_wo_note/sched_unlock_wo_note and it does not perform instrumentation logic
3 We aim to replace big locks with smaller ones. So we will use spin_lock_irqsave extensively to
replace enter_critical_section in the subsequent process. We imitate the implementation of Linux
by adding sched_lock to spin_lock_irqsave in order to address scenarios where sem_post occurs
within spin_lock_irqsave, which can lead to spinlock failures and deadlocks.

The entire implementation process includes:

1 spin_lock_irqsave + sched_lock
2 spin_lock/rw/spin_trylock + sched_lock
3 enter_critical_section + sched_lock
We are currently implementing the first step.

Impact

spinlock and sched_lock

Testing

Build Host:

  • OS: Ubuntu 20.04
  • CPU: x86_64
  • Compiler: GCC 9.4.0

Configuring NuttX and compile:
$ ./tools/configure.sh -l qemu-armv8a:nsh_smp
$ make
Running with qemu
$ qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic
-machine virt,virtualization=on,gic-version=3
-net none -chardev stdio,id=con,mux=on -serial chardev:con
-mon chardev=con,mode=readline -kernel ./nuttx

@github-actions github-actions bot added Area: OS Components OS Components issues Size: L The size of the change in this PR is large labels Oct 31, 2024
@github-actions github-actions bot added the Arch: xtensa Issues related to the Xtensa architecture label Oct 31, 2024
include/nuttx/spinlock.h Outdated Show resolved Hide resolved
sched/sched/sched.h Outdated Show resolved Hide resolved
include/sched.h Show resolved Hide resolved
sched/sched/sched.h Outdated Show resolved Hide resolved
sched/sched/sched_lock.c Outdated Show resolved Hide resolved
sched/sched/sched_lock.c Outdated Show resolved Hide resolved
sched/sched/sched_lock.c Outdated Show resolved Hide resolved
sched/sched/sched_unlock.c Outdated Show resolved Hide resolved
sched/sched/sched_unlock.c Show resolved Hide resolved
sched/sched/sched_unlock.c Outdated Show resolved Hide resolved
include/nuttx/spinlock.h Show resolved Hide resolved
@xiaoxiang781216 xiaoxiang781216 linked an issue Nov 18, 2024 that may be closed by this pull request
@xiaoxiang781216
Copy link
Contributor

@patacongo please review this patch which fix the long issue about sched lock.

sched/sched/sched_lock.c Outdated Show resolved Hide resolved
sched/sched/sched_lock.c Outdated Show resolved Hide resolved
sched/sched/sched_lock.c Outdated Show resolved Hide resolved
@hujun260 hujun260 force-pushed the apache_3 branch 2 times, most recently from 87f9b88 to cde599d Compare November 20, 2024 01:38
@github-actions github-actions bot added the Arch: risc-v Issues related to the RISC-V (32-bit or 64-bit) architecture label Nov 20, 2024
@github-actions github-actions bot added the Arch: risc-v Issues related to the RISC-V (32-bit or 64-bit) architecture label Jan 15, 2025
@hujun260 hujun260 force-pushed the apache_3 branch 2 times, most recently from edd4876 to 21ddf45 Compare January 15, 2025 06:17
@@ -41,10 +41,12 @@ SYSCALL_LOOKUP(sched_getparam, 2)
SYSCALL_LOOKUP(sched_getscheduler, 1)
SYSCALL_LOOKUP(sched_lock, 0)
SYSCALL_LOOKUP(sched_lockcount, 0)
SYSCALL_LOOKUP(sched_lock_wo_note, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but the implement of lib_get_tempbuffer switch to atomic operation.

sched/sched/sched_lock.c Outdated Show resolved Hide resolved
sched/sched/sched_lock.c Outdated Show resolved Hide resolved
sched/sched/sched_unlock.c Outdated Show resolved Hide resolved
sched/sched/sched_unlock.c Outdated Show resolved Hide resolved
/* This operation is safe because the scheduler is locked and no context
* switch may occur.
*/
if (list_pendingtasks()->head != NULL)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need hold critical section before check the pending task list?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we just need to add a memory synchronization here.
If list_pendingtasks()->head is modified to (not NULL) by another CPU,
but the current CPU doesn't detect it, it's okay. The other core will continue to check list_pendingtasks()->head when sched_unlock is called.
Additionally, we will perform a double-check in nxsched_merge_pending, which is within the critical section.

include/nuttx/spinlock.h Show resolved Hide resolved
include/nuttx/spinlock.h Show resolved Hide resolved
include/nuttx/spinlock.h Show resolved Hide resolved
include/nuttx/spinlock.h Outdated Show resolved Hide resolved
{
/* Note that we no longer have pre-emption disabled. */

#if CONFIG_SCHED_CRITMONITOR_MAXTIME_PREEMPTION >= 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's try best to reduce the change

sched/sched/sched_lock.c Outdated Show resolved Hide resolved
@github-actions github-actions bot removed the Arch: risc-v Issues related to the RISC-V (32-bit or 64-bit) architecture label Jan 19, 2025
@hujun260 hujun260 force-pushed the apache_3 branch 2 times, most recently from 8ab2f7f to fe48a33 Compare January 19, 2025 09:28
include/nuttx/spinlock.h Outdated Show resolved Hide resolved
include/nuttx/spinlock.h Outdated Show resolved Hide resolved
include/nuttx/spinlock.h Outdated Show resolved Hide resolved
include/nuttx/spinlock.h Outdated Show resolved Hide resolved
include/nuttx/spinlock.h Outdated Show resolved Hide resolved
include/nuttx/spinlock.h Outdated Show resolved Hide resolved
include/nuttx/spinlock.h Outdated Show resolved Hide resolved
include/nuttx/spinlock.h Outdated Show resolved Hide resolved
@hujun260 hujun260 force-pushed the apache_3 branch 3 times, most recently from 09a3313 to 2e11d88 Compare January 20, 2025 04:55
@@ -80,7 +80,7 @@
do \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to sched/irq/irq.h

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in next patch

SYSCALL_LOOKUP(sched_rr_get_interval, 2)
SYSCALL_LOOKUP(sched_setparam, 2)
SYSCALL_LOOKUP(sched_setscheduler, 3)
SYSCALL_LOOKUP(sched_unlock, 0)
SYSCALL_LOOKUP(sched_unlock_wo_note, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does usespace really call sched_unlock_wo_note

{
#if (CONFIG_SCHED_CRITMONITOR_MAXTIME_PREEMPTION >= 0) ||\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#if (CONFIG_SCHED_CRITMONITOR_MAXTIME_PREEMPTION >= 0) || \


if (rtcb->lockcount == 1)
rtcb = this_task();
if (rtcb != NULL)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why need check != NULL

if (!up_interrupt_context())
{
rtcb = this_task();
if (rtcb != NULL)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't need check?

#ifdef CONFIG_SCHED_INSTRUMENTATION_PREEMPTION
sched_note_preemption(rtcb, true);
#else
sched_lock_wo_note();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove, could share the same path

if (!up_interrupt_context())
{
rtcb = this_task();
if (rtcb != NULL)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the check


if (rtcb->lockcount <= 0)
tcb = this_task();
if (tcb != NULL)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the check

rtcb->timeslice == 0)
leave_critical_section_wo_note(flags);
#else
tcb->lockcount--;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's share the same code path

/* Note that we no longer have pre-emption disabled. */
#if (CONFIG_SCHED_CRITMONITOR_MAXTIME_PREEMPTION >= 0) ||\
defined(CONFIG_SCHED_INSTRUMENTATION_PREEMPTION)
irqstate_t flags = enter_critical_section_wo_note();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move after line 240

… with the _wo_note suffix.

Signed-off-by: hujun5 <hujun5@xiaomi.com>
…/irq.h

Signed-off-by: hujun5 <hujun5@xiaomi.com>
@github-actions github-actions bot added Arch: arm Issues related to ARM (32-bit) architecture Arch: arm64 Issues related to ARM64 (64-bit) architecture Arch: risc-v Issues related to the RISC-V (32-bit or 64-bit) architecture Arch: simulator Issues related to the SIMulator Arch: x86_64 Issues related to the x86_64 architecture labels Jan 21, 2025
…ed_[un]lock

reason:
1 Accelerated the implementation of sched_lock, remove enter_critical_section in sched_lock and
only enter_critical_section when task scheduling is required.
2 we add sched_lock_wo_note/sched_unlock_wo_note and it does not perform instrumentation logic

Signed-off-by: hujun5 <hujun5@xiaomi.com>
reason:
We aim to replace big locks with smaller ones. So we will use spin_lock_irqsave extensively to
replace enter_critical_section in the subsequent process. We imitate the implementation of Linux
by adding sched_lock to spin_lock_irqsave in order to address scenarios where sem_post occurs
within spin_lock_irqsave, which can lead to spinlock failures and deadlocks.

Signed-off-by: hujun5 <hujun5@xiaomi.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Arch: arm Issues related to ARM (32-bit) architecture Arch: arm64 Issues related to ARM64 (64-bit) architecture Arch: risc-v Issues related to the RISC-V (32-bit or 64-bit) architecture Arch: simulator Issues related to the SIMulator Arch: x86_64 Issues related to the x86_64 architecture Arch: xtensa Issues related to the Xtensa architecture Area: OS Components OS Components issues Size: L The size of the change in this PR is large
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Thread holding spinlock can be blocked. Thread can be unexpectedly suspended within a critical section.
5 participants