Process struct should carry an fd and use openat #125

ghost · 2021-05-01T22:17:48Z

Under Linux, procfs can suffer from TOCTTOU caused by pid reuse. This is evident if you take the self_memory example and have it run on external process. I'll show some output from strace to illustrate the problem:

openat(AT_FDCWD, "/proc/1234/statm", O_RDONLY|O_CLOEXEC) = 3
... snip ...
openat(AT_FDCWD, "/proc/1234/status", O_RDONLY|O_CLOEXEC) = 3

In between calls to openat, our process could be pre-empted, and then the pid could potentially be killed and re-used by another process before our process resumes. This would result in the call to Process::status() returning wrong information, when really it should return an error because the original process 1234 it was referencing is dead. To fix the problem, you'd want the syscalls to look more like this:

openat(AT_FDCWD, "/proc/1234", O_RDONLY|O_CLOEXEC) = 3
openat(3, "statm", O_RDONLY|O_CLOEXEC) = 4
openat(3, "status", O_RDONLY|O_CLOEXEC) = 4

The easiest way to do this would be to replace every area where file paths are joined with an intentional call to openat. The net effect of this would be that Process would no longer store root, it would store a handle to the directory in procfs. This should also be done when reading from entries in /proc/[pid]/task/[tid]. Another change that should probably happen along with this is that process::all_processes() should return an Iterator instead of a Vec, that way it will prevent a huge number of file descriptors being opened at once. This is an API break so I'm proposing it here before submitting any pull request.

The text was updated successfully, but these errors were encountered:

eminence · 2021-05-03T14:49:16Z

Hi,

Thanks for opening this issue. At first glance, this sounds like a nice improvement. Your example with "statm" and "status" is well motivated. I would be definitely willing to see a pull request in this area. One reason that the current API for all_processes() returns a Vec instead of an iterator is to reduce the chances of a process disappearing if doing a slow iteration. I'm not sure if openat really helps for all_processes(), though I guess it'll be a more consistent "snapshot"

You have a good point about needed to manage the large number of open FDs. How does returning an iterator help? If the caller immediately collects everything into a Vec, we would still have an open FD for every running process, right?

eminence · 2021-05-03T14:52:27Z

Also, I don't see any support for openat in the stdlib. Would you rely on nix for this functionality? Or use libc?

ghost · 2021-05-04T20:31:06Z

One reason that the current API for all_processes() returns a Vec instead of an iterator is to reduce the chances of a process disappearing if doing a slow iteration.

That will ensure you get a list of pids that was active at the time of iteration but will not prevent the pids from becoming stale. The method I'm suggesting will be slower; that's unavoidable because it's moving from just holding the pid number to now having to do a syscall to grab a reference to a kernel resource. But it will also technically be more correct, the iterator can skip over pids that became stale during processing of the directory.

If we want to preserve the quick iteration, I think it would make more sense to have the method return a Vec of u32 (or more accurately pid_t). That will communicate more what is going on: the pid is just a number that could become stale, you could then pass it to a constructor later that will try to open a handle and return a Result<Process> accordingly.

How does returning an iterator help? If the caller immediately collects everything into a Vec, we would still have an open FD for every running process, right?

Yes. The idea is that for implementing something like top, programs would map over the iterator and pull out the information they need into a different structure. On Linux the default max fds open is 1024, and it's definitely possible to have more than 1024 processes open, so we probably want to avoid pushing up against that limit.

Also, I don't see any support for openat in the stdlib. Would you rely on nix for this functionality? Or use libc?

libc can probably be used if trying to keep dependencies to a minimum, the nix crate is not much more than some minimal safe wrappers around libc functions. Both would work.

lparcq · 2021-05-12T08:38:41Z

The result from all_processes is always wrong whichever method you use. It's better to minimize the number of opened file descriptors. Btw, it makes no sense to use a program like top on a system where pid sare reused faster than reading the result of all_processes().
However, using openat for Process::new would be much more reliable on heavily loaded systems.

ghost · 2021-05-12T14:05:29Z

It won't necessarily be wrong if a small number of fds are kept open, for example a top program might want to keep the fd to the currently highlighted process open. That process could then be signaled in a race-free way using pidfd. If the pids are re-used faster than top can read them (i.e. the openat to /proc/<pid>/stat succeeds but /proc/<pid>/cmdline fails) then that process would be discarded and not shown in the output since it's known that the process is already dead.

ghost mentioned this issue May 11, 2021

RFC: Change Process to use directory fd #127

Closed

sunfishcode mentioned this issue Mar 15, 2022

Revive RFC: Change Process to use directory fd #171

Merged

eminence closed this as completed in #171 Apr 5, 2022

obaudys mentioned this issue Jul 22, 2022

procfs::process::all_processes() can blow "open files" ulimit on a system with a large number of processes #188

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Process struct should carry an fd and use openat #125

Process struct should carry an fd and use openat #125

ghost commented May 1, 2021

eminence commented May 3, 2021

eminence commented May 3, 2021

ghost commented May 4, 2021

lparcq commented May 12, 2021

ghost commented May 12, 2021

Process struct should carry an fd and use openat #125

Process struct should carry an fd and use openat #125

Comments

ghost commented May 1, 2021

eminence commented May 3, 2021

eminence commented May 3, 2021

ghost commented May 4, 2021

lparcq commented May 12, 2021

ghost commented May 12, 2021