failed to write to /proc/self/oom_score_adj: Permission denied #3024

Closed
dac73 opened this issue Apr 26, 2019 · 56 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@dac73

dac73 commented Apr 26, 2019

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

When running rootless podman, I get the error "failed to write to /proc/self/oom_score_adj: Permission denied", but the container still runs afterwards.

Steps to reproduce the issue:

  1. podman run -it --rm anycontainer

Describe the results you received:
The container runs, but an error message is printed in the terminal.

Describe the results you expected:
The container should start without an error message.

Additional information you deem important (e.g. issue happens only occasionally):
Error message: failed to write to /proc/self/oom_score_adj: Permission denied
Output of podman version:

podman version 1.2.0

Output of podman info --debug:

debug:
  compiler: gc
  git commit: ""
  go version: go1.11.6
  podman version: 1.2.0
host:
  BuildahVersion: 1.7.2
  Conmon:
    package: podman-1.2.0-1.1.x86_64
    path: /usr/lib/podman/bin/conmon
    version: "failed to write to /proc/self/oom_score_adj: Permission denied, conmon
      version 1.14.0\ncommit: "
  Distribution:
    distribution: '"opensuse-tumbleweed"'
    version: "20190423"
  MemFree: 2402082816
  MemTotal: 11992973312
  OCIRuntime:
    package: runc-1.0.0~rc6-3.1.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc6
      spec: 1.0.1-dev
  SwapFree: 2145845248
  SwapTotal: 2147479552
  arch: amd64
  cpus: 4
  hostname: kraken
  kernel: 5.0.8-1-default
  os: linux
  rootless: true
  uptime: 4h 43m 30.3s (Approximately 0.17 days)
insecure registries:
  registries: []
registries:
  registries:
  - docker.io
store:
  ConfigFile: /home/dario/.config/containers/storage.conf
  ContainerStore:
    number: 1
  GraphDriverName: vfs
  GraphOptions: null
  GraphRoot: /home/dario/.local/share/containers/storage
  GraphStatus: {}
  ImageStore:
    number: 20
  RunRoot: /tmp/1000
  VolumePath: /home/dario/.local/share/containers/storage/volumes

Additional environment details (AWS, VirtualBox, physical, etc.):
physical

@openshift-ci-robot openshift-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Apr 26, 2019
@rhatdan
Member

rhatdan commented Apr 26, 2019

Rootless?
Is this an SELinux issue? Does it work in permissive mode?

@edsantiago
Member

Not an SELinux issue: I'm seeing it on my Gentoo laptop, which (I know, I know!) does not have SELinux enabled.

@dac73
Author

dac73 commented Apr 26, 2019

It's rootless.

@rhatdan
Member

rhatdan commented Apr 26, 2019

I tried on Fedora and it is allowed.
Could you try with --privileged and see if it is allowed?

@rhatdan
Member

rhatdan commented Apr 26, 2019

If it works with --privileged, then try without --privileged and try
--security-opt seccomp=unconfined

@dac73
Author

dac73 commented Apr 26, 2019

Same error for both options.

@edsantiago
Member

Same here (i.e. --privileged and --security-opt seccomp=unconfined make no difference)

@mheon
Member

mheon commented Apr 26, 2019 via email

@vrothberg
Member

vrothberg commented Apr 26, 2019

@mheon, do you know when it has been changed? The podman v1.2.0 package in openSUSE is using conmon from CRI-O 1.14.0.

@mheon
Member

mheon commented Apr 26, 2019

I'm pretty sure @haircommander made the change in question, so I'll tag him in for that one

@haircommander
Collaborator

haircommander commented Apr 26, 2019

This is conmon. I didn't know any podman was shipping with that recent a CRI-O version. We expect this debug message, though I actually think conmon's log handling is weird at the moment. (Note: the change I made just made this error non-fatal.)

@haircommander
Collaborator

I feel like this should just silently happen tbh

@vrothberg
Member

Looking forward to c/conmon kicking off :) @sysrich, I think openSUSE could downgrade to an older conmon for podman to quick-fix the issue.

@haircommander
Collaborator

@sysrich @vrothberg release 1.13.3 should not have this problem

@haircommander
Collaborator

ah actually, we did fix this problem upstream, but haven't cut a new release on 1.14 with the updates yet. I will look into it, but for the time being I'd go back to 1.13.3 so users don't think rootless is failing

@edsantiago
Member

My laptop updated cri-o this morning, 1.13.5 to 1.13.7, and the oom_score_adj warning is gone.

@baude
Member

baude commented May 29, 2019

are we good to close this?

@baude baude added the retiring label May 29, 2019
@haircommander
Collaborator

yes--people will need to use containers/conmon 0.2.0

@dac73
Author

dac73 commented Jun 3, 2019

This was fixed for me with the last TW update.
Conmon is @0.2.0
Thanks.

@rhatdan rhatdan closed this as completed Jun 4, 2019
@tobwen
Contributor

tobwen commented Oct 19, 2019

Problem returned in rootless :(

Describe the results you received:

[conmon:d]: failed to write to /proc/self/oom_score_adj: Permission denied

Output of podman version:

Version:            1.6.3-dev
RemoteAPI Version:  1
Go Version:         go1.11.5
OS/Arch:            linux/amd64

Output of podman info --debug:

Version:            1.6.3-dev
RemoteAPI Version:  1
Go Version:         go1.11.5
OS/Arch:            linux/amd64
tobwen@pgsql:~/podman/usr/local/bin$ ./podman --tmpdir /tmp/user/1000/libpod/tmp info debug
Error: `podman system info` takes no arguments
tobwen@pgsql:~/podman/usr/local/bin$ ./podman --tmpdir /tmp/user/1000/libpod/tmp info --debug
debug:
  compiler: gc
  git commit: ""
  go version: go1.11.5
  podman version: 1.6.3-dev
host:
  BuildahVersion: 1.11.3
  CgroupVersion: v1
  Conmon:
    package: Unknown
    path: /home/tobwen/podman/usr/local/bin/conmon
    version: 'conmon version 2.0.3-dev, commit: bc758d8bd98a29ac3aa4f62a886575bfec0e39a1'
  Distribution:
    distribution: debian
    version: "9"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    - container_id: 65537
      host_id: 1258512
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    - container_id: 65537
      host_id: 1258512
      size: 65536
  MemFree: 35587067904
  MemTotal: 38205444096
  OCIRuntime:
    name: runc
    package: Unknown
    path: /home/tobwen/podman/usr/local/bin/runc
    version: |-
      runc version 1.0.0-rc9+dev
      commit: 4e3701702e966b4258fbab5b92efa6418c5ae6c6
      spec: 1.0.1-dev
  SwapFree: 8586784768
  SwapTotal: 8586784768
  arch: amd64
  cpus: 8
  eventlogger: journald
  hostname: pgsql
  kernel: 4.19.0-0.bpo.6-amd64
  os: linux
  rootless: true
  uptime: 29h 19m 13.88s (Approximately 1.21 days)
registries:
  blocked: null
  insecure: null
  search:
  - docker.io
store:
  ConfigFile: /home/tobwen/.config/containers/storage.conf
  ContainerStore:
    number: 3
  GraphDriverName: vfs
  GraphOptions: {}
  GraphRoot: /home/tobwen/.local/share/containers/storage
  GraphStatus: {}
  ImageStore:
    number: 1
  RunRoot: /tmp/user/1000
  VolumePath: /home/tobwen/.local/share/containers/storage/volumes

@rhatdan
Member

rhatdan commented Oct 20, 2019

@tobwen what command were you executing when you saw this error? If you turn on debug logging you will still see it.

@tobwen
Contributor

tobwen commented Oct 20, 2019

Due to bugs with the local config, I have to use this commandline

/home/tobwen/podman/usr/local/bin/podman --log-level=debug \
--tmpdir /tmp/user/1000/libpod/tmp \
--conmon /home/tobwen/podman/usr/local/bin/conmon \
--network-cmd-path /home/tobwen/podman/usr/local/bin/slirp4netns \
--runtime /home/tobwen/podman/usr/local/bin/runc \
--storage-driver overlay \
--storage-opt "overlay.mount_program=/home/tobwen/podman/usr/local/bin/fuse-overlayfs"

When not using debug logging, I don't see it.

But that's interesting:

$ ls -al /proc/self/oom_score_adj
-rw-r--r-- 1 tobwen tobwen 0 Oct 20 12:43 /proc/self/oom_score_adj

This file seems to belong to the current user. Or does conmon try to read/write from within a namespace?

@rhatdan
Member

rhatdan commented Oct 21, 2019

It might be happening within a user namespace.

All three of these succeed:

$ echo 1 > /proc/self/oom_score_adj
$ podman unshare echo 1 > /proc/self/oom_score_adj
$ podman run fedora echo 1 > /proc/self/oom_score_adj

But writing a negative value fails:

$ echo -1 > /proc/self/oom_score_adj
bash: echo: write error: Permission denied
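
These results fit the kernel rule, as far as I understand it, that an unprivileged process may raise its own oom_score_adj but needs CAP_SYS_RESOURCE to lower it below the minimum it inherited. Note also that in the podman lines the `>` redirection is opened by the calling shell before podman starts, so those two lines may not test quite what they appear to; wrapping the command in `sh -c '...'` would move the redirection inside. The probe below is only a sketch: `try_oom_score_adj` is a hypothetical helper name, and the optional file argument exists purely so it can be tried on a scratch file.

```sh
#!/bin/sh
# Hypothetical helper (not part of podman or conmon): try to write VALUE to an
# oom_score_adj file and report whether the write was accepted.
# The body runs in a subshell, so with the default path /proc/self refers to
# that short-lived child and the calling shell's own score is left alone.
try_oom_score_adj() (
    value=$1
    file=${2:-/proc/self/oom_score_adj}
    # Grouping the command lets 2>/dev/null silence both a failed open()
    # and a failed write().
    if { printf '%s\n' "$value" > "$file"; } 2>/dev/null; then
        echo "ok: wrote $value"
    else
        echo "denied: could not write $value"
    fi
)
```

Calling `try_oom_score_adj 1` and then `try_oom_score_adj -1` as a regular user should reproduce the raise-versus-lower difference shown above. Inside a rootless container the same probe could be run with `sh -c` so that the redirection happens in the container.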

@rhatdan
Member

rhatdan commented Oct 21, 2019

Conmon is attempting to set this score in rootless mode, which is not allowed.

conmon.c:#define OOM_SCORE "-999"
conmon.c:		if (write(oom_score_fd, OOM_SCORE, strlen(OOM_SCORE)) < 0) {
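
Earlier in the thread the failure is described as non-fatal and visible only with debug logging. A shell analogue of that "try, log at debug level, carry on" pattern might look like the sketch below. It is not conmon's code: the function name and the `DEBUG` variable are made up for this example, and the file argument is there only so it can be exercised on a scratch file.

```sh
#!/bin/sh
# Sketch only: mirrors the best-effort behaviour described in this thread.
# Not conmon's actual implementation; DEBUG and the function name are
# hypothetical.
set_oom_score_best_effort() {
    _oom_value=$1
    _oom_file=${2:-/proc/self/oom_score_adj}
    if ! { printf '%s\n' "$_oom_value" > "$_oom_file"; } 2>/dev/null; then
        # Report the failure only when debugging is requested.
        if [ "${DEBUG:-0}" = "1" ]; then
            echo "[debug] failed to write $_oom_value to $_oom_file" >&2
        fi
    fi
    # Never fail the caller: the score is a nicety, not a requirement.
    return 0
}
```

With this shape a rootless caller keeps running when the write is refused, and the message stays hidden unless debugging is switched on.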

@jasonbrooks

I'm trying to get toolbox running in silverblue 31. It fails with:

Error: unable to start container "fedora-toolbox-31": writing file '/proc/46288/gid_map': Operation not permitted
setgid(0): Invalid argument: OCI runtime permission denied error
toolbox: failed to start container fedora-toolbox-31

Elsewhere, I saw a suggestion about running with: systemd-run --scope --user podman --log-level debug start fedora-toolbox-31. When I do that, I get an error like I'm seeing in here:

[conmon:d]: failed to write to /proc/self/oom_score_adj: Permission denied

@haircommander
Collaborator

@jasonbrooks that message is expected, it won't be printed without --log-level debug

@jasonbrooks

@haircommander Ah, so not related to my toolbox issue

@bosd

bosd commented Sep 17, 2023

Update: This solved it for me:

edit /usr/lib/systemd/system/user@.service and remove the following line.

OOMScoreAdjust=100
systemd/systemd@ce7de0b#diff-e712f4f510835ff8be07a088402d790305356f9284023c7a286056ccf38147e3R28

Reboot

Source: https://bbs.archlinux.org/viewtopic.php?pid=2012166#p2012166
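
Before editing the unit, it may be worth confirming that the login session really inherited a non-zero score. The sketch below assumes nothing beyond the /proc file itself; `describe_oom_score` is a hypothetical helper name, and the optional argument lets it read a different PID's file or a scratch file.

```sh
#!/bin/sh
# Hypothetical helper: print the oom_score_adj found in FILE (default: the
# reading process's own entry, which is inherited from the calling shell) and
# say whether it differs from the kernel default of 0.
describe_oom_score() (
    file=${1:-/proc/self/oom_score_adj}
    score=$(cat "$file") || exit 1
    if [ "$score" -eq 0 ]; then
        echo "oom_score_adj=$score (kernel default)"
    else
        echo "oom_score_adj=$score (non-default; possibly set by a systemd OOMScoreAdjust= line)"
    fi
)
```

If this reports a non-default value in a fresh user session, `systemctl show "user@$(id -u).service" -p OOMScoreAdjust` should show what the user manager was started with.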

@GregJohnStewart

In case this helps determine the issue, I have attached two inspect outputs, one for run and one from the (failing) start.

They aren't quite the same image, due to the start being created from the java api for Docker, but they are close.

inspects.zip

@MaximilianGaedig

MaximilianGaedig commented Sep 18, 2023

@bosd thank you for the fix, it worked for me. It is not optimal, though: it makes user processes get OOM-killed with the same priority as root processes, while by my understanding they should be prioritized for killing.

@matzew

matzew commented Sep 18, 2023

I have updated to Podman 4.6.2 and noticed the regression.

The workaround from @bosd worked on my box - but not sure that's really optimal, as @MaximilianGaedig states in his comment

@rhafer
Contributor

rhafer commented Sep 18, 2023

BTW this is tracked in #19930 and #19843 is supposed to fix it. (downgrading crun to 1.8.7 apparently also helps)

@bruno-fs

I recently updated to podman 4.6.2 on fedora 38 and noticed this issue as well.

@setofaces

> Update: This solved it for me:
>
> edit /usr/lib/systemd/system/user@.service and remove the following line.
>
> OOMScoreAdjust=100 [systemd/systemd@ce7de0b#diff-
>
> Reboot
>
> Source: https://bbs.archlinux.org/viewtopic.php?pid=2012166#p2012166

macOS with the latest podman here. I removed the line over a podman machine ssh connection, but after restarting the VM the OOMScoreAdjust=100 property is back and the error, as expected, repeats. Is there any workaround in that case? Thank you

@setofaces

UPD: rebooting with sudo over ssh prevents this file from being overwritten, but the problem remains even without OOMScoreAdjust=100

@jorhett

jorhett commented Sep 22, 2023

4 years later? The old bug should be locked and a new bug should be opened 🙏

@jenifera0110

> I have updated to Podman 4.6.2 and noticed the regression.
>
> The workaround from @bosd worked on my box - but not sure that's really optimal, as @MaximilianGaedig states in his comment

chmod: changing permissions of '/usr/lib/systemd/system/user@.service': Read-only file system. Sudo doesn't work. How do we write to this file to remove that property?

@setofaces

> > I have updated to Podman 4.6.2 and noticed the regression.
> > The workaround from @bosd worked on my box - but not sure that's really optimal, as @MaximilianGaedig states in his comment
>
> chmod: changing permissions of '/usr/lib/systemd/system/user@.service': Read-only file system. Sudo doesn't work, how do we write this file to remove that property ?

Just mount the directory, then change the permissions.

@stan-shih

> > > I have updated to Podman 4.6.2 and noticed the regression.
> > > The workaround from @bosd worked on my box - but not sure that's really optimal, as @MaximilianGaedig states in his comment
> >
> > chmod: changing permissions of '/usr/lib/systemd/system/user@.service': Read-only file system. Sudo doesn't work, how do we write this file to remove that property ?
>
> Just mount directory, then change permissions

I fixed it as described in the linked comment (#19930 (comment)). You can try it.

@jenifera0110

> > > > I have updated to Podman 4.6.2 and noticed the regression.
> > > > The workaround from @bosd worked on my box - but not sure that's really optimal, as @MaximilianGaedig states in his comment
> > >
> > > chmod: changing permissions of '/usr/lib/systemd/system/user@.service': Read-only file system. Sudo doesn't work, how do we write this file to remove that property ?
> >
> > Just mount directory, then change permissions
>
> I fixed it below the link(#19930 (comment)). You can try it.

This doesn't help me.

Screenshot 2023-09-25 at 09 19 37

@kdubois

kdubois commented Sep 25, 2023

I ran into the same issue on Fedora 38. Per a comment in #19930, I downgraded crun (sudo dnf downgrade crun) and that seems to have worked for now.

@WinterWolf98

Was able to solve this by upgrading to Fedora v38.20230902.3.0

@fansari

fansari commented Oct 1, 2023

I have updated from Fedora Silverblue 38 to 39 and one of the first things I did was to check whether podman is still working. Nope! All rootless containers fail with

crun: write to `/proc/self/oom_score_adj`: Permission denied: OCI permission denied

Since /usr is write protected in Fedora Silverblue I have copied user@.service:

cp /usr/lib/systemd/system/user@.service /etc/systemd/system

I can confirm that after a reboot rootless containers start again.
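
The workaround quoted earlier in this thread also removes the OOMScoreAdjust= line, so on a system with a read-only /usr the copy and the edit can be combined. The following is a minimal sketch under that assumption; `copy_unit_without_oom_adjust` is a hypothetical helper name, and both paths are parameters so it can be tried on scratch files before touching /etc.

```sh
#!/bin/sh
# Hypothetical helper: copy a unit file to an override location, dropping any
# OOMScoreAdjust= line on the way. Defaults follow the paths used in this
# thread; writing under /etc needs root.
copy_unit_without_oom_adjust() (
    src=${1:-/usr/lib/systemd/system/user@.service}
    dest=${2:-/etc/systemd/system/user@.service}
    # Delete only lines that set OOMScoreAdjust=; everything else is copied
    # unchanged.
    sed '/^[[:space:]]*OOMScoreAdjust=/d' "$src" > "$dest"
)
```

A full copy in /etc shadows the packaged unit, so later updates to the file under /usr would no longer take effect until the copy is removed.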

podman version 4.6.2
crun version 1.9

@stemid

stemid commented Oct 13, 2023

> Update: This solved it for me:
>
> edit /usr/lib/systemd/system/user@.service and remove the following line.
>
> OOMScoreAdjust=100 systemd/systemd@ce7de0b#diff-e712f4f510835ff8be07a088402d790305356f9284023c7a286056ccf38147e3R28
>
> Reboot
>
> Source: https://bbs.archlinux.org/viewtopic.php?pid=2012166#p2012166

Since this thread comes up on Google I just want to add that this helped me with running Gitlab Runner in a rootless podman on CoreOS. Since /usr is write protected, I copied user@.service to /etc/systemd/system, and removed that one line, rebooted and now Gitlab runner can run jobs again.

Just two releases ago I did not have this issue, so it must be a very recent bug in CoreOS 38.

@andrewgdunn

andrewgdunn commented Oct 13, 2023

@stemid appreciate that comment, I've not attempted this yet on my systems (which are centos9stream gitlab runners)... wondering if it would be better to consider the override mechanism within systemd.

@stemid

stemid commented Oct 14, 2023

> @stemid appreciate that comment, I've not attempted this yet on my systems (which are centos9stream gitlab runners)... wondering if it would be better to consider the override mechanism within systemd.

Yeah, that is actually what I ended up doing; my comment was a bit premature. I used the equivalent of systemctl edit user@.service, but since I deploy on CoreOS with Ignition, that means I created the directory /etc/systemd/system/user@.service.d with a file override.conf containing just two lines:

[Service]
OOMScoreAdjust=
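
On a machine that is not provisioned through Ignition, the same drop-in could be created by hand. The sketch below writes exactly the two lines shown above; `write_user_service_dropin` and its root-directory parameter are hypothetical, added so the function can be tried in a scratch directory first.

```sh
#!/bin/sh
# Sketch: create the user@.service drop-in described above.
# ROOT (first argument, default "/") is a hypothetical parameter for dry runs.
write_user_service_dropin() (
    root=${1:-/}
    dir="${root%/}/etc/systemd/system/user@.service.d"
    mkdir -p "$dir" || exit 1
    # An empty OOMScoreAdjust= assignment, as reported to work in this thread.
    printf '[Service]\nOOMScoreAdjust=\n' > "$dir/override.conf"
)
```

Run as root against the real root directory, this would presumably need to be followed by `systemctl daemon-reload` and a reboot, or at least a fresh login session, before rootless containers pick up the change.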

@edgarjoao

Hi @stemid
I just followed the steps provided by you, but still not able to start any container.

Here is my configuration.

Fedora Linux 39 (Server Edition)
Podman
Client: Podman Engine
Version: 4.7.2
API Version: 4.7.2
Go Version: go1.21.1
Built: Tue Oct 31 08:32:01 2023
OS/Arch: linux/amd64

crun version 1.11.2
commit: ab0edeef1c331840b025e8f1d38090cfb8a0509d
rundir: /run/user/1001/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL

Any advice?

Thanks,
Edgar

@luckylinux

luckylinux commented Nov 30, 2023

Could this be due (also) to ACLs ? Specifically in my case, I had to set ZFS acltype=posix in order to NOT encounter issues while trying to spin up e.g. Mosquitto MQTT Container (otherwise I'd get mkdir - operation not permitted). See #11213.

However, now that ACLs are enabled in ZFS, it became much slower. I tried to set xattr=sa to make it faster by storing extended attributes in the inode/dnode directly, but that didn't really help.

podman start mycontainer or even podman ps hangs indefinitely. The only solution is to reboot!

Then the hanging/freezing stopped and now I get this

[conmon:d]: failed to write to /proc/self/oom_score_adj: Permission denied

I also tried the previous fixes, both in /etc/systemd/system/user@.service and /etc/systemd/system/user@.service.d/override.conf but it doesn't help.

I have a much older version of Podman being on Debian Linux but still ...
It also seems after enabling ACL I get more and more weird errors

host:
  arch: arm64
  buildahVersion: 1.28.2
  cgroupControllers:
  - cpu
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon_2.1.6+ds1-1_arm64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.6, commit: unknown'
  cpuUtilization:
    idlePercent: 98.58
    systemPercent: 0.55
    userPercent: 0.87
  cpus: 8
  distribution:
    codename: bookworm
    distribution: debian
    version: "12"
  eventLogger: journald
  hostname: Rock5B-01
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1002
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1002
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.6.2-1-arm64
  linkmode: dynamic
  logDriver: journald
  memFree: 15235072000
  memTotal: 16477868032
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun_1.8.1-1+b1_arm64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.1
      commit: f8a096be060b22ccd3d5f3ebe44108517fbf6c30
      rundir: /run/user/1002/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1002/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns_1.2.0-1_arm64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.4
  swapFree: 0
  swapTotal: 0
  uptime: 0h 10m 58.00s
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/podman/.config/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 1
    stopped: 1
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs_1.10-1_arm64
      Version: |-
        fusermount3 version: 3.14.0
        fuse-overlayfs: version 1.10
        FUSE library version 3.14.0
        using FUSE kernel interface version 7.31
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /home/podman/storage
  graphRootAllocated: 1931117854720
  graphRootUsed: 1314783232
  graphStatus:
    Backing Filesystem: zfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 3
  runRoot: /run/user/1002/containers
  volumePath: /home/podman/storage/volumes
version:
  APIVersion: 4.3.1
  Built: 0
  BuiltTime: Thu Jan  1 00:00:00 1970
  GitCommit: ""
  GoVersion: go1.19.8
  Os: linux
  OsArch: linux/arm64
  Version: 4.3.1

Strangely it doesn't happen with all containers. I could get (again) podman-compose to bring down (destroy) & up (recreate) an instance of Home Assistant, while Mosquitto MQTT Container Image stubbornly refuses to work.

@edgarjoao

edgarjoao commented Dec 6, 2023

Hi there,
After installing Docker, I was able to run podman containers

sudo dnf -y install dnf-plugins-core
sudo dnf config-manager --add-repo https://download.docker.com/linux/fedora/docker-ce.repo

sudo dnf install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

sudo systemctl start docker

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Mar 6, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 6, 2024