Skip to content

Commit

Permalink
cygthread: suspend thread before terminating.
Browse files Browse the repository at this point in the history
This addresses an extremely difficult to debug deadlock when running
under emulation on ARM64.

A relatively easy way to trigger this bug is to call `fork()`, then
within the child process immediately call another `fork()` and then
`exit()` the intermediate process.

It would seem that there is a "code emulation" lock on the wait thread
at this stage, and if the thread is terminated too early, that lock
still exists albeit without a thread, and nothing moves forward.

It seems that a `SuspendThread()` combined with a `GetThreadContext()`
(to force the thread to _actually_ be suspended, for more details see
https://devblogs.microsoft.com/oldnewthing/20150205-00/?p=44743) makes
sure the thread is "booted" from emulation before it is suspended.

Hopefully this means it won't be holding any locks or otherwise leave
emulation in a bad state when the thread is terminated.

Also, attempt to use `CancelSynchonousIo()` (as seen in `flock.cc`) to
avoid the need for `TerminateThread()` altogether.  This doesn't always
work, however, so was not a complete fix for the deadlock issue.

Addresses: https://cygwin.com/pipermail/cygwin-developers/2024-May/012694.html
Signed-off-by: Jeremy Drake <cygwin@jdrake.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
  • Loading branch information
dscho committed Nov 12, 2024
1 parent 6183b83 commit 6469fc6
Show file tree
Hide file tree
Showing 2 changed files with 16 additions and 1 deletion.
14 changes: 14 additions & 0 deletions winsup/cygwin/cygthread.cc
Original file line number Diff line number Diff line change
Expand Up @@ -302,6 +302,20 @@ cygthread::terminate_thread ()
if (!inuse)
goto force_notterminated;

if (_my_tls._ctinfo != this)
{
CONTEXT context;
context.ContextFlags = CONTEXT_CONTROL;
/* SuspendThread makes sure a thread is "booted" from emulation before
it is suspended. As such, the emulator hopefully won't be in a bad
state (aka, holding any locks) when the thread is terminated. */
SuspendThread (h);
/* We need to call GetThreadContext, even though we don't care about the
context, because SuspendThread is asynchronous and GetThreadContext
will make sure the thread is *really* suspended before returning */
GetThreadContext (h, &context);
}

TerminateThread (h, 0);
WaitForSingleObject (h, INFINITE);
CloseHandle (h);
Expand Down
3 changes: 2 additions & 1 deletion winsup/cygwin/sigproc.cc
Original file line number Diff line number Diff line change
Expand Up @@ -410,7 +410,8 @@ proc_terminate ()
if (!have_execed || !have_execed_cygwin)
chld_procs[i]->ppid = 1;
if (chld_procs[i].wait_thread)
chld_procs[i].wait_thread->terminate_thread ();
if (!CancelSynchronousIo (chld_procs[i].wait_thread->thread_handle ()))
chld_procs[i].wait_thread->terminate_thread ();
/* Release memory associated with this process unless it is 'myself'.
'myself' is only in the chld_procs table when we've execed. We
reach here when the next process has finished initializing but we
Expand Down

0 comments on commit 6469fc6

Please sign in to comment.