-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
libct: don't close syncPipe #4183
Conversation
Recent CVE-2024-21626 fix (commit f2f1621) broke a recently added test (commit 0bc4732) because if exec fails, runc is unable to communicate the error back to the parent. Let's not close syncPipe. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@@ -154,7 +154,7 @@ func (l *linuxSetnsInit) Init() error { | |||
// that all O_CLOEXEC file descriptors have already been closed and thus | |||
// the second execve(2) from runc-dmz cannot access internal file | |||
// descriptors from runc. | |||
if err := utils.UnsafeCloseFrom(l.config.PassedFilesCount + 3); err != nil { | |||
if err := utils.UnsafeCloseFrom(l.config.PassedFilesCount+3, int(l.pipe.File().Fd())); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this fd created with the close on exec option? I guess so, but it will be nice to verify it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it is:
runc/libcontainer/sync_unix.go
Line 77 in 0212048
fds, err := unix.Socketpair(unix.AF_LOCAL, unix.SOCK_SEQPACKET|unix.SOCK_CLOEXEC, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect!
Sorry, something changed and while I can answer now, I can't mark this as resolved.
// We cannot use close_range(2) even if it is available, because we must | ||
// not close some file descriptors. | ||
return fdRangeFrom(minFd, func(fd int) { | ||
for _, ex := range except { | ||
if fd == ex { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we try to check if the fd in question has the closeOnExec flag set before leaving it open?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, having close on exec flag is not a panacea.
Second, the one we're excepting is not a file or directory, so no CVE-2024-21626 style exploits are possible.
Maybe we can check that excepted fd is not referring to a file or directory...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like this? Or is this check totally redundant? WDYT @rata?
diff --git a/libcontainer/utils/utils_unix.go b/libcontainer/utils/utils_unix.go
index 87055358..4daa40d4 100644
--- a/libcontainer/utils/utils_unix.go
+++ b/libcontainer/utils/utils_unix.go
@@ -130,6 +130,18 @@ func runtime_IsPollDescriptor(fd uintptr) bool //nolint:revive
func UnsafeCloseFrom(minFd int, except ...int) error {
// We cannot use close_range(2) even if it is available, because we must
// not close some file descriptors.
+
+ // Currently we only allow sockets to be excepted.
+ for _, fd := range except {
+ var st unix.Stat_t
+ err := unix.Fstat(fd, &st)
+ if err != nil {
+ return os.NewSyscallError("fstat", err)
+ }
+ if st.Mode&unix.S_IFMT != unix.S_IFSOCK {
+ return fmt.Errorf("excepted fd %d is not a socket", fd)
+ }
+ }
return fdRangeFrom(minFd, func(fd int) {
for _, ex := range except {
if fd == ex {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- What do you mean with "close on exec is not a panacea"? That we don't know if this is a socket, dirfd or whatever, so checking for close on exec can vary?
- Can't fstat() be costly?
- Why wouldn't we check that even the socket is close on exec? We don't want to leak socket fds either, right? Sure, the implications might be different of leaking a socket, but why would we want to leak a socket?
But yeah, if it is because of 1, and it isn't easy to do a type assertion or something and check the close on exec, I think we can spare the fstat() call, this diff you pasted and just keep the code as it is.
I would add a big note to the function documentation, saying the fds excepted must be manually verified to be closed on exec, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- What do you mean with "close on exec is not a panacea"?
What I mean by it is, in addition to having O_CLOEXEC
set, we also have to explicitly close file descriptors before calling execve
. See "Attacks 3a and 3b" in GHSA-xr7r-f8xq-vfvv (in short, something like process.args=/proc/self/fd/7/../../../bin/bash
can be used by an attacker).
So, checking the O_CLOEXEC won't be enough here, this is why I'm checking that it's a socket (i.e. not a file or directory), as one can't use "attacks 3a/3b" with a socket.
- Can't fstat() be costly?
Indeed it can (and it can also fail for some reason). That is why I don't like the checking code. We already know it's a socket when we call this function.
- Why wouldn't we check that even the socket is close on exec?
Because close on exec is not a panacea (see above).
I would add a big note to the function documentation, saying the fds excepted must be manually verified to be closed on exec, though.
I agree about a note (not about close on exec) -- will add.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right, great point. If we go forward with this, I think a comment is all that is needed (explaining which kind of fds is safe to leave open).
However, the other PR to just close the fd seems better IMHO. We don't risk anything, is simpler, we still enforce all fds are closed...
We have discussed this topic in a private fork, how about close it explicitly using |
I also found the DEBU[0000]libcontainer/sync.go:127 libcontainer.doReadSync() reading sync
DEBU[0000] child process in init()
DEBU[0000] setns_init: about to exec
DEBU[0000]libcontainer/sync.go:131 libcontainer.doReadSync() sync pipe closed
Failed to write to log, write logpipe: file already closed
writing sync procError: write sync: bad file descriptor
exec /run.sh: no such file or directory
DEBU[0000]signals.go:92 main.(*signalHandler).forward() process exited |
@sctb512 please don't hijack this -- this PR is about a totally different issue than the one you have. If you feel there's a bug, open a bug report (or a PR to fix it). If you have a question -- we have /~https://github.com/opencontainers/runc/discussions |
Closing in favor of #4192 which is simpler. |
Recent CVE-2024-21626 fix (commit f2f1621) broke a recently added test (commit 0bc4732, #4173) because if exec fails, runc is unable to communicate the error back to the parent.
Let's not close syncPipe.
This fixes the following test failure: