-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rootlessport: fix potential hang #5183
rootlessport: fix potential hang #5183
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: giuseppe The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@AkihiroSuda PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. LGTM
} | ||
errCh <- driverErr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe close errCh?
LGTM once comment is addressed |
write to the error pipe only in case of an error. Otherwise we may end up in a race condition in the select statement below as the read from errChan happens before initComplete and the function returns immediately nil. Closes: containers#5182 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
LGTM once the comment is addressed. CI isn't buying it, probably not related. |
cf944ee
to
51d3693
Compare
it took a while as I've found another race condition. More details in the second patch. I've tried |
@nalind I see the reexec package is using |
there is a race condition where the child process is immediately killed: [pid 2576752] arch_prctl(0x3001 /* ARCH_??? */, 0x7ffdf612f170) = -1 EINVAL (Invalid argument) [pid 2576752] access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) [pid 2576752] --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=2576742, si_uid=0} --- [pid 2576752] +++ killed by SIGTERM +++ this happens because the parent process here really means the "parent thread". Since there is no way of running it on the main thread, let's skip this functionality altogether and use kill(2). Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
51d3693
to
5b69e7f
Compare
/lgtm |
cmd.Args = []string{reexecChildKey} | ||
cmd.Stdin = childQuitR | ||
cmd.Stdout = &logrusWriter{prefix: "child"} | ||
cmd.Stderr = cmd.Stdout | ||
cmd.Env = append(os.Environ(), reexecChildEnvOpaque+"="+string(opaqueJSON)) | ||
cmd.SysProcAttr = &syscall.SysProcAttr{ | ||
Pdeathsig: syscall.SIGTERM, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what was the problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the child process is killed whenever the parent thread exits which we have no control about since it is managed by the Go runtime.
PR to address the same race condition in containers/storage: containers/storage#530 |
write to the error pipe only in case of an error. Otherwise we may
end up in a race condition in the select statement below as the read
from errChan happens before initComplete and the function returns
immediately nil.
Closes: #5182
Signed-off-by: Giuseppe Scrivano gscrivan@redhat.com