Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

segmentation fault with podman and ocrmypdf #315

Closed
Fulguritus opened this issue Jan 10, 2022 · 20 comments · Fixed by #361
Closed

segmentation fault with podman and ocrmypdf #315

Fulguritus opened this issue Jan 10, 2022 · 20 comments · Fixed by #361

Comments

@Fulguritus
Copy link

Fulguritus commented Jan 10, 2022

I originally filed the issue in ocrmypdf: ocrmypdf/OCRmyPDF#889

Essentially, when doing

podman run --network none --rm -i jbarlow83/ocrmypdf:v13.2.0 --jobs 1 -l deu - - <tmp.pdf >out.pdf

the output hangs, and I see a segfault in conmon in dmesg:

[24408.182063] conmon[100696]: segfault at a8a000 ip 00007f025e26d9b5 sp 00007ffcc9e773e8 error 4 in libc.so.6[7f025e11e000+176000]
[24408.182091] Code: fd 74 5f 41 c5 fd 74 67 61 c5 ed eb e9 c5 dd eb f3 c5 cd eb ed c5 fd d7 cd 85 c9 75 48 48 83 ef 80 48 81 ea 80 00 00 00 77 cb <c5> fd 74 4f 01 c5 fd d7 c1 66 90 85 c0 75 5c 83 c2 40 0f 8f c3 00

Address within libc: 0x00007f025e26d9b5 - 0x7f025e11e000 = 0x14F9B5

$ addr2line -e /usr/lib/libc.so.6 -fCi 0x14F9B5
__GI_netname2host
:?

Running the same with log-level=debug, I see

DEBU[0270] Sending signal 2 to container b2799f12bc64a5e78947be68ec1c9c37daaa0de03a47c4de4531e87f32d7551f 
2022-01-10T16:46:08.000913691Z: open pidfd: No such process
ERRO[0275] No exit file for container b2799f12bc64a5e78947be68ec1c9c37daaa0de03a47c4de4531e87f32d7551f found: timed out waiting for file /run/user/1000/libpod/tmp/exits/b2799f12bc64a5e78947be68ec1c9c37daaa0de03a47c4de4531e87f32d7551f: internal libpod error 
ERRO[0275] Error forwarding signal 2 to container b2799f12bc64a5e78947be68ec1c9c37daaa0de03a47c4de4531e87f32d7551f: error sending signal to container b2799f12bc64a5e78947be68ec1c9c37daaa0de03a47c4de4531e87f32d7551f: `/usr/bin/crun kill b2799f12bc64a5e78947be68ec1c9c37daaa0de03a47c4de4531e87f32d7551f 2` failed: exit status 1 

Searching for this on the net brought up a potential fix on https://issueexplorer.com/issue/containers/conmon/251
Of this suggestion, only the first part made it into your code.

It is also worth noting that this does not happen all the time, but that is probably due to the output of ocrmypdf not being deterministic. Additionally, when this happens, podman logs contains some content that should go to stdout in stderr (and vice-versa).

This is executed on:

  • Fedora 35
  • conmon version 2.0.30
  • podman version 3.4.4
  • jbarlow83/ocrmypdf:v13.2.0
  • The input file contains copyrighted material and can be made available to anyone looking into this (but I do not want to simply upload it here)
@Fulguritus
Copy link
Author

Fulguritus commented Jan 13, 2022

Further details:

systemd-coredump[286424]: [🡕] Process 286401 (conmon) of user 1000 dumped core.
                                                
Found module linux-vdso.so.1 with build-id: 6761b48f74742f0a29da87e0229205606b993486
Found module libgpg-error.so.0 with build-id: 081975d0a3416374b4883b2f1639fd3c9df42390
Found module ld-linux-x86-64.so.2 with build-id: b43118df1fdb4c0aff150b6f8f926bccdec2a7f0
Found module libgcc_s.so.1 with build-id: 88564abce789aa42536da1247a57ff6062d61dcb
Found module libgcrypt.so.20 with build-id: 711d41580c5a8649a79a8430a985dac3e25b5ba2
Found module libcap.so.2 with build-id: 0214aa9cc6a8646eb9ec27ab7dda6a2219da6500
Found module liblz4.so.1 with build-id: fd02c4542a1ce1ad6c503b958447be1c37a4afee
Found module libzstd.so.1 with build-id: 0f6e793c8bbdb5bb5683c5b9e61aadfcb7a7454a
Found module liblzma.so.5 with build-id: 7fec53ce7cba9489f130c99cfd3ace82e9dde0ee
Found module libpcre.so.1 with build-id: fcbdb4731e50f20a47b62341a6841dd616fe495d
Found module libc.so.6 with build-id: 644dac2c66a6e0b32674f0ec71e7431bd0c06a63
Found module libseccomp.so.2 with build-id: d9e37e30e12bcd8de6d550fda9c874c8870067d9
Found module libsystemd.so.0 with build-id: e4996d7fd9705035bdd6072e68c9dc65647d7d1c
Found module libglib-2.0.so.0 with build-id: 3720108994016259cac8f61f174ea32513518152
Found module conmon with build-id: d185e6305296ab5c3539014320a9f083e49b214c
Stack trace of thread 286401:
#0  0x00007f6cbbc2a8b5 __memchr_avx2 (libc.so.6 + 0x17b8b5)
#1  0x00007f6cbbd38585 sd_journal_sendv (libsystemd.so.0 + 0x5f585)
#2  0x00007f6cbbd3a061 sd_journal_sendv_with_location (libsystemd.so.0 + 0x61061)
#3  0x0000000000405e10 write_journald (conmon + 0x5e10)
#4  0x0000000000406364 write_to_logs (conmon + 0x6364)
#5  0x00000000004093e9 read_stdio (conmon + 0x93e9)
#6  0x0000000000409445 stdio_cb (conmon + 0x9445)
#7  0x00007f6cbbe00130 g_main_context_dispatch (libglib-2.0.so.0 + 0x55130)
#8  0x00007f6cbbe55208 g_main_context_iterate.constprop.0 (libglib-2.0.so.0 + 0xaa208)
#9  0x00007f6cbbdff853 g_main_loop_run (libglib-2.0.so.0 + 0x54853)
#10 0x0000000000404d9a main (conmon + 0x4d9a)
#11 0x00007f6cbbadc560 __libc_start_call_main (libc.so.6 + 0x2d560)
#12 0x00007f6cbbadc60c __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x2d60c)
#13 0x0000000000405235 _start (conmon + 0x5235)

Stack trace of thread 286403:
#0  0x00007f6cbbbb473f __poll (libc.so.6 + 0x10573f)
#1  0x00007f6cbbe5519c g_main_context_iterate.constprop.0 (libglib-2.0.so.0 + 0xaa19c)
#2  0x00007f6cbbdfd933 g_main_context_iteration (libglib-2.0.so.0 + 0x52933)
#3  0x00007f6cbbdfd981 glib_worker_main (libglib-2.0.so.0 + 0x52981)
#4  0x00007f6cbbe2a842 g_thread_proxy (libglib-2.0.so.0 + 0x7f842)
#5  0x00007f6cbbb3ca87 start_thread (libc.so.6 + 0x8da87)
#6  0x00007f6cbbbc1640 __clone3 (libc.so.6 + 0x112640)

@Fulguritus
Copy link
Author

Does this mean the issue is related to logging? By the nature of ocrmypdf, there will be (large) binary streams sent to STDOUT. Could it be that the construction of bufv in write_journald() cannot handle that kind of data or this amount of data? Could well be several (dozen/hundred) MBs of data in a scanned PDF.

@Fulguritus
Copy link
Author

Is this related to #264?

@jess-sol
Copy link

jess-sol commented Feb 1, 2022

I might be seeing a similar or the same issue:

podman run -ti centos:latest dd if=/dev/random of=/dev/stdout bs=1024 count=1024 > /dev/null # works
podman run -ti centos:latest dd if=/dev/random of=/dev/stdout bs=1024 count=10240 > /dev/null # fails

I get this error:

2022-02-01T17:32:05.000071504Z: open pidfd: No such process
ERRO[0084] No exit file for container 07f817432f8e9b45a5f2160cd5c56514a936cb509d9a39bfb587a7d5343fda14 found: timed out waiting for file /run/user/1000/libpod/tmp/exits/07f817432f8e9b45a5f2160cd5c56514a936cb509d9a39bfb587a7d5343fda14: internal libpod error
ERRO[0084] Error forwarding signal 28 to container 07f817432f8e9b45a5f2160cd5c56514a936cb509d9a39bfb587a7d5343fda14: error sending signal to container 07f817432f8e9b45a5f2160cd5c56514a936cb509d9a39bfb587a7d5343fda14: `/usr/bin/crun kill 07f817432f8e9b45a5f2160cd5c56514a936cb509d9a39bfb587a7d5343fda14 28` failed: exit status 1

And see this seg fault in dmesg

[ 2534.499889] conmon[37735]: segfault at 2187000 ip 00007f94c900f293 sp 00007ffd88aa3fc8 error 4 in libc.so.6[7f94c8f91000+176000]
[ 2534.503488] Code: d7 c3 85 c0 0f 85 ad 00 00 00 83 ea 10 0f 8e c4 00 00 00 66 0f 74 4f 30 66 0f d7 c1 85 c0 0f 85 a3 00 00 00 31 c0 c3 83 c2 20 <66> 0f 6f 07 66 0f 74 c1 66 0f d7 c0 85 c0 75 5d 83 ea 10 0f 86 94

Versions:

  • Fedora 35 (versions 35.20220103.3.0 and 35.20220116.1.1)
  • Conmon 2.0.30
  • Podman 3.4.4

The error is from the built version of #264, same error with 2.0.30 too

@lyarwood
Copy link

lyarwood commented Feb 1, 2022

I'm also seeing this when building a dev environment for the kubevirt project. For whatever reason part of the current process is to dump a 40mb binary file to the stdout of a container before piping it into a local file:

/~https://github.com/kubevirt/kubevirt/blob/f16dd510918daf5ff4653aeae055822cdd99ce4e/cluster-up/cluster/k8s-provider-common.sh#L125

  • Fedora 35
  • conmon-2.0.30-2.fc35.x86_64 (also reproduced with 2.1.0 from HEAD)
$ cd kubevirt
$ sudo make cluster-up
[..]
++ podman run --privileged --net=host --rm -v /root/podman.sock:/var/run/docker.sock -v /lib/modules/:/lib/modules/ quay.io/kubevirtci/gocli:2201311315-109a7c39 scp --prefix k8s-1.20 /usr/bin/kubectl -
^CERRO[0000] container not running                        
ERRO[0024] No exit file for container f3f07950d5da92bd1417f93683a8e9512f20f7055025df241c7c9bfb7b3abc48 found: timed out waiting for file /run/libpod/exits/f3f07950d5da92bd1417f93683a8e9512f20f7055025df241c7c9bfb7b3abc48: internal libpod error 
ERRO[0024] forwarding signal 2 to container f3f07950d5da92bd1417f93683a8e9512f20f7055025df241c7c9bfb7b3abc48: error sending signal to container f3f07950d5da92bd1417f93683a8e9512f20f7055025df241c7c9bfb7b3abc48: `/usr/bin/runc kill f3f07950d5da92bd1417f93683a8e9512f20f7055025df241c7c9bfb7b3abc48 2` failed: exit status 1 
make: *** [Makefile:138: cluster-up] Interrupt

$ sudo coredumpctl dump /usr/bin/conmon
           PID: 240263 (conmon)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Tue 2022-02-01 18:18:33 GMT (1min 30s ago)
  Command Line: /usr/bin/conmon --api-version 1 -c 33eb4a57fdebf699aec953f2059b8ce3e655f104138119d9711aae4bff5ee5ef -u 33eb4a57fdebf699aec953f2059b8ce3e655f104138119d9711aae4bff5ee5ef -r /usr/bin/runc -b /var/lib/containers/storage/overlay-containers/33eb4a57fdebf699aec953f2059b8ce3e655f104138119d9711aae4bff5ee5ef/userdata -p /run/containers/storage/overlay-containers/33eb4a57fdebf699aec953f2059b8ce3e655f104138119d9711aae4bff5ee5ef/userdata/pidfile -n sad_archimedes --exit-dir /run/libpod/exits --full-attach -s -l journald --log-level warning --runtime-arg --log-format=json --runtime-arg --log --runtime-arg=/run/containers/storage/overlay-containers/33eb4a57fdebf699aec953f2059b8ce3e655f104138119d9711aae4bff5ee5ef/userdata/oci-log --conmon-pidfile /run/containers/storage/overlay-containers/33eb4a57fdebf699aec953f2059b8ce3e655f104138119d9711aae4bff5ee5ef/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/lib/containers/storage --exit-command-arg --runroot --exit-command-arg /run/containers/storage --exit-command-arg --log-level --exit-command-arg warning --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /run/libpod --exit-command-arg --cni-config-dir --exit-command-arg /etc/cni/net.d/ --exit-command-arg --network-backend --exit-command-arg cni --exit-command-arg --runtime --exit-command-arg runc --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --storage-opt --exit-command-arg overlay.mountopt=nodev,metacopy=on --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg --rm --exit-command-arg 33eb4a57fdebf699aec953f2059b8ce3e655f104138119d9711aae4bff5ee5ef
    Executable: /usr/bin/conmon
 Control Group: /machine.slice/libpod-conmon-33eb4a57fdebf699aec953f2059b8ce3e655f104138119d9711aae4bff5ee5ef.scope
          Unit: libpod-conmon-33eb4a57fdebf699aec953f2059b8ce3e655f104138119d9711aae4bff5ee5ef.scope
         Slice: machine.slice
       Boot ID: e594f814d5684121ba8bfcd66c3a66e3
    Machine ID: 9225d3461f9d43c999f3ae7e7daa2109
      Hostname: lyarwood-laptop.usersys.redhat.com
       Storage: /var/lib/systemd/coredump/core.conmon.0.e594f814d5684121ba8bfcd66c3a66e3.240263.1643739513000000.zst (present)
     Disk Size: 103.6K
       Message: Process 240263 (conmon) of user 0 dumped core.
                
                Found module linux-vdso.so.1 with build-id: dd9f72d7bd194bdaf5264b45aa11ecb7d35396d6
                Found module libgpg-error.so.0 with build-id: 081975d0a3416374b4883b2f1639fd3c9df42390
                Found module ld-linux-x86-64.so.2 with build-id: 524f5518285687402d801b9c3b1e61d435e6780b
                Found module libgcc_s.so.1 with build-id: 88564abce789aa42536da1247a57ff6062d61dcb
                Found module libgcrypt.so.20 with build-id: 711d41580c5a8649a79a8430a985dac3e25b5ba2
                Found module libcap.so.2 with build-id: 0214aa9cc6a8646eb9ec27ab7dda6a2219da6500
                Found module liblz4.so.1 with build-id: fd02c4542a1ce1ad6c503b958447be1c37a4afee
                Found module libzstd.so.1 with build-id: 7984add4b0ab24869c30e19cfba3637ca295dd7b
                Found module liblzma.so.5 with build-id: 7fec53ce7cba9489f130c99cfd3ace82e9dde0ee
                Found module libpcre.so.1 with build-id: fcbdb4731e50f20a47b62341a6841dd616fe495d
                Found module libc.so.6 with build-id: d7e3e120ff5fafc298666927c6475567348588ef
                Found module libseccomp.so.2 with build-id: d9e37e30e12bcd8de6d550fda9c874c8870067d9
                Found module libsystemd.so.0 with build-id: b61753e8440ecf39624b3e37f855941c18a69e9b
                Found module libglib-2.0.so.0 with build-id: de8abd1f0b2c53eea9872862805b922e633f93e8
                Found module conmon with build-id: d185e6305296ab5c3539014320a9f083e49b214c
                Stack trace of thread 240263:
                #0  0x00007fa0d7ff1940 __memchr_avx2 (libc.so.6 + 0x17b940)
                #1  0x00007fa0d80ff5a5 sd_journal_sendv (libsystemd.so.0 + 0x5f5a5)
                #2  0x00007fa0d8101081 sd_journal_sendv_with_location (libsystemd.so.0 + 0x61081)
                #3  0x0000000000405e10 write_journald (conmon + 0x5e10)
                #4  0x0000000000406364 write_to_logs (conmon + 0x6364)
                #5  0x00000000004093e9 read_stdio (conmon + 0x93e9)
                #6  0x0000000000409445 stdio_cb (conmon + 0x9445)
                #7  0x00007fa0d81c705f g_main_context_dispatch (libglib-2.0.so.0 + 0x5505f)
                #8  0x00007fa0d821c298 g_main_context_iterate.constprop.0 (libglib-2.0.so.0 + 0xaa298)
                #9  0x00007fa0d81c6773 g_main_loop_run (libglib-2.0.so.0 + 0x54773)
                #10 0x0000000000404d9a main (conmon + 0x4d9a)
                #11 0x00007fa0d7ea3560 __libc_start_call_main (libc.so.6 + 0x2d560)
                #12 0x00007fa0d7ea360c __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x2d60c)
                #13 0x0000000000405235 _start (conmon + 0x5235)
                
                Stack trace of thread 240265:
                #0  0x00007fa0d7f7b79f __poll (libc.so.6 + 0x10579f)
                #1  0x00007fa0d821c22c g_main_context_iterate.constprop.0 (libglib-2.0.so.0 + 0xaa22c)
                #2  0x00007fa0d81c4853 g_main_context_iteration (libglib-2.0.so.0 + 0x52853)
                #3  0x00007fa0d81c48a1 glib_worker_main (libglib-2.0.so.0 + 0x528a1)
                #4  0x00007fa0d81f1772 g_thread_proxy (libglib-2.0.so.0 + 0x7f772)
                #5  0x00007fa0d7f03b1a start_thread (libc.so.6 + 0x8db1a)
                #6  0x00007fa0d7f88650 __clone3 (libc.so.6 + 0x112650)
Refusing to dump core to tty (use shell redirection or specify --output).

@jess-sol
Copy link

jess-sol commented Feb 1, 2022

I'm attempting an install of Openshift and see it on the control nodes. The machine-config-daemon-pull service attempts to dump a binary file to stdout (to save as a local file) as well; and hangs due to this issue. I was attempting it in Proxmox, but see the same issue locally with KVM/QEMU.

EDIT: Looks like other projects are starting to work around the issue - openshift/machine-config-operator@cb76ecf

lyarwood added a commit to lyarwood/kubevirtci that referenced this issue Feb 1, 2022
There is currently a known conmon issue [1] when podman based containers
dump large amounts of data to stdout. gocli would previously do this
when copying the kubectl binary to the host machine.

This change replaces that with a simple volume mount workaround when
using podman until the underlying conmon issue is resolved.

[1] containers/conmon#315

Signed-off-by: Lee Yarwood <lyarwood@redhat.com>
lyarwood added a commit to lyarwood/kubevirtci that referenced this issue Feb 1, 2022
There is currently a known conmon issue [1] when podman based containers
dump large amounts of data to stdout. gocli would previously do this
when copying the kubectl binary to the host machine.

This change replaces that with a simple volume mount workaround when
using podman until the underlying conmon issue is resolved.

[1] containers/conmon#315

Signed-off-by: Lee Yarwood <lyarwood@redhat.com>
lyarwood added a commit to lyarwood/kubevirtci that referenced this issue Feb 10, 2022
There is currently a known conmon issue [1] when podman based containers
dump large amounts of data to stdout. gocli would previously do this
when copying the kubectl binary to the host machine.

This change replaces that with a simple volume mount workaround when
using podman until the underlying conmon issue is resolved.

[1] containers/conmon#315

Signed-off-by: Lee Yarwood <lyarwood@redhat.com>
lyarwood added a commit to lyarwood/kubevirtci that referenced this issue Feb 10, 2022
There is currently a known conmon issue [1] when podman based containers
dump large amounts of data to stdout. gocli would previously do this
when copying the kubectl binary to the host machine.

This change replaces that with a simple volume mount workaround when
using podman until the underlying conmon issue is resolved.

[1] containers/conmon#315

Signed-off-by: Lee Yarwood <lyarwood@redhat.com>
lyarwood added a commit to lyarwood/kubevirtci that referenced this issue Feb 10, 2022
There is currently a known conmon issue [1] when podman based containers
dump large amounts of data to stdout. gocli would previously do this
when copying the kubectl binary to the host machine.

This change replaces that with a simple volume mount workaround when
using podman until the underlying conmon issue is resolved.

[1] containers/conmon#315

Signed-off-by: Lee Yarwood <lyarwood@redhat.com>
lyarwood added a commit to lyarwood/kubevirtci that referenced this issue Feb 15, 2022
There is currently a known conmon issue [1] when podman based containers
dump large amounts of data to stdout. gocli would previously do this
when copying the kubectl binary to the host machine.

This change replaces that with a simple volume mount workaround when
using podman until the underlying conmon issue is resolved.

[1] containers/conmon#315

Signed-off-by: Lee Yarwood <lyarwood@redhat.com>
kubevirt-bot pushed a commit to kubevirt/kubevirtci that referenced this issue Feb 15, 2022
There is currently a known conmon issue [1] when podman based containers
dump large amounts of data to stdout. gocli would previously do this
when copying the kubectl binary to the host machine.

This change replaces that with a simple volume mount workaround when
using podman until the underlying conmon issue is resolved.

[1] containers/conmon#315

Signed-off-by: Lee Yarwood <lyarwood@redhat.com>
@haircommander
Copy link
Collaborator

sorry for the delay folks I haven't had the bandwidth to work on this but it is on my list of things to look at

@phemmer
Copy link

phemmer commented Mar 18, 2022

Another report of this.

One thing that seems to be consistent with all the reports is that we're all dumping data from the container over STDOUT. jess-sol's comment seems to be on the right track regarding size. I've tried to narrow down at what size it dies, but it seems to vary. For me it's somewhere around 1860000 bytes. Sometimes using 1860000 will be fine, and sometimes it'll segfault. If I go much higher it consistently segfaults. If I go much lower it consistently works.
I've tried slowing it down (piping into pv -L), and it makes no difference. It's always around that total size when it dies.

Though in my case, I can work around the issue by passing --log-driver=none.

podman run --rm timescaledb-toolkit:1.5.0-38-g26aa60c tar -cz packages >/dev/null
conmon[654652]: segfault at bd3000 ip 00007f0f64230a30 sp 00007ffedbb2bf98 error 4 in libc.so.6[7f0f640e1000+176000]
Code: 66 2e 0f 1f 84 00 00 00 00 00 f3 0f bc c0 48 ff c7 48 01 f8 c5 f8 77 c3 66 90 f3 0f bc c0 48 83 ef df 48 01 f8 c5 f8 77 c3 90 <c5> fd 74 8f 81 00 00 00 c5 fd d7 c1 48 83 ef 80 85 c0 75 9c 83 c2
           PID: 654652 (conmon)
           UID: 1000 (phemmer)
           GID: 1000 (phemmer)
        Signal: 11 (SEGV)
     Timestamp: Fri 2022-03-18 11:57:36 EDT (2min 3s ago)
  Command Line: /usr/bin/conmon --api-version 1 -c 5dd9d592a36f05d2c2128e7648a5a0fdbfea3dcadac62f091339bb88cc49aeae -u 5dd9d592a36f05d2c2128e7648a5a0fdbfea3dcadac62f091339bb88cc49aeae -r /usr/bin/crun -b /home/phemmer/.local/share/containers/storage/overlay-containers/5dd9d592a36f05d2c2128e7648a5a0fdbfea3dcadac62f091339bb88cc49aeae/userdata -p /run/user/1000/overlay-containers/5dd9d592a36f05d2c2128e7648a5a0fdbfea3dcadac62f091339bb88cc49aeae/userdata/pidfile -n elastic_snyder --exit-dir /run/user/1000/libpod/tmp/exits --full-attach -l journald --log-level warning --runtime-arg --log-format=json --runtime-arg --log --runtime-arg=/run/user/1000/overlay-containers/5dd9d592a36f05d2c2128e7648a5a0fdbfea3dcadac62f091339bb88cc49aeae/userdata/oci-log --conmon-pidfile /run/user/1000/overlay-containers/5dd9d592a36f05d2c2128e7648a5a0fdbfea3dcadac62f091339bb88cc49aeae/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /home/phemmer/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000 --exit-command-arg --log-level --exit-command-arg warning --exit-command-arg --cgroup-manager --exit-command-arg cgroupfs --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --storage-opt --exit-command-arg overlay.mount_program=/usr/bin/fuse-overlayfs --exit-command-arg --storage-opt --exit-command-arg overlay.mount_program=/usr/bin/fuse-overlayfs --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg --rm --exit-command-arg 5dd9d592a36f05d2c2128e7648a5a0fdbfea3dcadac62f091339bb88cc49aeae
    Executable: /usr/bin/conmon
 Control Group: /user.slice/user-1000.slice/user@1000.service/user.slice/podman-654563.scope
          Unit: user@1000.service
     User Unit: podman-654563.scope
         Slice: user-1000.slice
     Owner UID: 1000 (phemmer)
       Boot ID: 2d4e556481e7464ab5e9b98a6930946d
    Machine ID: c7586e4fd4d741d3bef2e634d2d89c31
      Hostname: whistler
       Storage: /var/lib/systemd/coredump/core.conmon.1000.2d4e556481e7464ab5e9b98a6930946d.654652.1647619056000000.zst (present)
     Disk Size: 234.9K
       Message: Process 654652 (conmon) of user 1000 dumped core.
                
                Found module linux-vdso.so.1 with build-id: fb3d9d23e731a0e438355b1502ea5441c3ae2d6d
                Found module libgpg-error.so.0 with build-id: 081975d0a3416374b4883b2f1639fd3c9df42390
                Found module ld-linux-x86-64.so.2 with build-id: 4e2c368b1240116994daff7f0f248a01f6e53198
                Found module libgcc_s.so.1 with build-id: 88564abce789aa42536da1247a57ff6062d61dcb
                Found module libgcrypt.so.20 with build-id: 711d41580c5a8649a79a8430a985dac3e25b5ba2
                Found module libcap.so.2 with build-id: 0214aa9cc6a8646eb9ec27ab7dda6a2219da6500
                Found module liblz4.so.1 with build-id: fd02c4542a1ce1ad6c503b958447be1c37a4afee
                Found module libzstd.so.1 with build-id: 7984add4b0ab24869c30e19cfba3637ca295dd7b
                Found module liblzma.so.5 with build-id: 7fec53ce7cba9489f130c99cfd3ace82e9dde0ee
                Found module libpcre.so.1 with build-id: fcbdb4731e50f20a47b62341a6841dd616fe495d
                Found module libc.so.6 with build-id: f2837f792a83ba96d761c1b715570411dc0ef584
                Found module libseccomp.so.2 with build-id: d9e37e30e12bcd8de6d550fda9c874c8870067d9
                Found module libsystemd.so.0 with build-id: b61753e8440ecf39624b3e37f855941c18a69e9b
                Found module libglib-2.0.so.0 with build-id: c90c7d2fa86ed0320be8a61af4b8cb2842ca5529
                Found module conmon with build-id: bae1696c801748ce40b14eb400ab730bc6f50b35
                Stack trace of thread 654652:
                #0  0x00007f0f64230a30 __memchr_avx2 (libc.so.6 + 0x17ba30)
                #1  0x00007f0f6433e5a5 sd_journal_sendv (libsystemd.so.0 + 0x5f5a5)
                #2  0x00007f0f64340081 sd_journal_sendv_with_location (libsystemd.so.0 + 0x61081)
                #3  0x0000000000405f37 write_journald (conmon + 0x5f37)
                #4  0x00000000004065b6 write_to_logs (conmon + 0x65b6)
                #5  0x000000000040970b read_stdio (conmon + 0x970b)
                #6  0x0000000000409767 stdio_cb (conmon + 0x9767)
                #7  0x00007f0f6440605f g_main_context_dispatch (libglib-2.0.so.0 + 0x5505f)
                #8  0x00007f0f6445b2a8 g_main_context_iterate.constprop.0 (libglib-2.0.so.0 + 0xaa2a8)
                #9  0x00007f0f64405773 g_main_loop_run (libglib-2.0.so.0 + 0x54773)
                #10 0x0000000000404cfd main (conmon + 0x4cfd)
                #11 0x00007f0f640e2560 __libc_start_call_main (libc.so.6 + 0x2d560)
                #12 0x00007f0f640e260c __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x2d60c)
                #13 0x00000000004051a5 _start (conmon + 0x51a5)
                
                Stack trace of thread 654654:
                #0  0x00007f0f641ba79f __poll (libc.so.6 + 0x10579f)
                #1  0x00007f0f6445b23c g_main_context_iterate.constprop.0 (libglib-2.0.so.0 + 0xaa23c)
                #2  0x00007f0f64403853 g_main_context_iteration (libglib-2.0.so.0 + 0x52853)
                #3  0x00007f0f644038a1 glib_worker_main (libglib-2.0.so.0 + 0x528a1)
                #4  0x00007f0f64430772 g_thread_proxy (libglib-2.0.so.0 + 0x7f772)
                #5  0x00007f0f64142b1a start_thread (libc.so.6 + 0x8db1a)
                #6  0x00007f0f641c7650 __clone3 (libc.so.6 + 0x112650)

@evverx
Copy link

evverx commented Oct 30, 2022

I started using podman recently to run a bunch of fuzz targets and noticed that those fuzz targets somehow managed to crash conmon on the host from inside their containers (which I would have probably reported privately if this issue wasn't already public for more than half a year).

Anyway as far as I can tell conmon doesn't take null bytes into account when it calculates msg_len while g_strdup_printf stops when it sees them so message can be much shorter than msg_len (depending on where null bytes are embedded):

conmon/src/ctr_logging.c

Lines 296 to 300 in 9e416a2

ssize_t msg_len = line_len + MESSAGE_EQ_LEN + *partial_buf_len;
partial_buf[*partial_buf_len] = '\0';
_cleanup_free_ char *message = g_strdup_printf("MESSAGE=%s%s", partial_buf, buf);
if (writev_buffer_append_segment(dev_null, &bufv, message, msg_len) < 0)
return -1;

It can be reproduced by building conmon with ASan and running the following command:

podman run fedora:36 bash -c 'printf "\0\n"'
=================================================================
==119632==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000002079 at pc 0x7f3a69c4d949 bp 0x7fffd73cb6f0 sp 0x7fffd73caea0
READ of size 10 at 0x602000002079 thread T0
    #0 0x7f3a69c4d948 in __interceptor_memchr.part.0 (/lib64/libasan.so.8+0x4d948)
    #1 0x7f3a69102a81 in sd_journal_sendv ../src/libsystemd/sd-journal/journal-send.c:278
    #2 0x7f3a6910845a in sd_journal_sendv_with_location ../src/libsystemd/sd-journal/journal-send.c:580
    #3 0x40e2e9 in write_journald ../src/ctr_logging.c:328
    #4 0x40f3b5 in write_to_logs ../src/ctr_logging.c:230
    #5 0x41bd0b in read_stdio ../src/ctr_stdio.c:138
    #6 0x41bee8 in stdio_cb ../src/ctr_stdio.c:35
    #7 0x7f3a69b18fae in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x54fae)
    #8 0x7f3a69b6e2c7 in g_main_context_iterate.constprop.0 (/lib64/libglib-2.0.so.0+0xaa2c7)
    #9 0x7f3a69b186ce in g_main_loop_run (/lib64/libglib-2.0.so.0+0x546ce)
    #10 0x40996d in main ../src/conmon.c:453
    #11 0x7f3a67e2950f in __libc_start_call_main (/lib64/libc.so.6+0x2950f)
    #12 0x7f3a67e295c8 in __libc_start_main_impl (/lib64/libc.so.6+0x295c8)
    #13 0x40aa74 in _start (/home/vagrant/conmon/build/conmon+0x40aa74)

0x602000002079 is located 0 bytes to the right of 9-byte region [0x602000002070,0x602000002079)
allocated by thread T0 here:
    #0 0x7f3a69cba68f in __interceptor_malloc (/lib64/libasan.so.8+0xba68f)
    #1 0x7f3a67e81ce7 in __vasprintf_internal (/lib64/libc.so.6+0x81ce7)
    #2 0x5144348b7aad34ff  (<unknown module>)

SUMMARY: AddressSanitizer: heap-buffer-overflow (/lib64/libasan.so.8+0x4d948) in __interceptor_memchr.part.0
Shadow bytes around the buggy address:
  0x0c047fff83b0: fa fa fd fd fa fa fd fd fa fa fd fa fa fa fd fa
  0x0c047fff83c0: fa fa 00 00 fa fa fd fd fa fa 00 fa fa fa 00 fa
  0x0c047fff83d0: fa fa 00 fa fa fa 07 fa fa fa 00 fa fa fa 00 fa
  0x0c047fff83e0: fa fa 00 fa fa fa 04 fa fa fa 00 00 fa fa 04 fa
  0x0c047fff83f0: fa fa 00 00 fa fa fd fd fa fa fd fa fa fa fd fa
=>0x0c047fff8400: fa fa fd fa fa fa fd fa fa fa 00 00 fa fa 00[01]
  0x0c047fff8410: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8420: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8430: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8440: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8450: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==119632==ABORTING

giuseppe added a commit to giuseppe/conmon that referenced this issue Nov 3, 2022
msg_len doesn't take into account NUL bytes that could be present in
the buffer, while g_strdup_printf("MESSAGE=%s%s", partial_buf, buf)
does and would stop writing the string once it finds a NUL byte.  This
would lead to read after the buffer end.

Build the message string manually so that the calculated length
reflects the buffer size.

Closes: containers#315

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@giuseppe
Copy link
Member

giuseppe commented Nov 3, 2022

I've opened a PR here: #361

giuseppe added a commit to giuseppe/conmon that referenced this issue Nov 3, 2022
msg_len doesn't take into account NUL bytes that could be present in
the buffer, while g_strdup_printf("MESSAGE=%s%s", partial_buf, buf)
does and would stop writing the string once it finds a NUL byte.  This
would lead to read after the buffer end.

Build the message string manually so that the calculated length
reflects the buffer size.

Closes: containers#315

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
haircommander pushed a commit that referenced this issue Nov 3, 2022
msg_len doesn't take into account NUL bytes that could be present in
the buffer, while g_strdup_printf("MESSAGE=%s%s", partial_buf, buf)
does and would stop writing the string once it finds a NUL byte.  This
would lead to read after the buffer end.

Build the message string manually so that the calculated length
reflects the buffer size.

Closes: #315

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@nickthetait
Copy link

I am trying to determine the security impact of this flaw. Could anyone help by answering these questions and/or confirming my assumptions

  • It causes a crash in conmon, but will that service come back on its own? About how long does that take?
  • So this would halt monitoring of other containers (outside the one where this issue was triggered)? But these containers at least continue operating correctly right?
  • An attacker could trigger this remotely (in a public cloud service for example) right? They would at least need an account with the service provider?
  • Can be triggered easily by sending null bytes
  • User interaction is not required

@evverx
Copy link

evverx commented Nov 5, 2022

It causes a crash in conmon, but will that service come back on its own? About how long does that take?

I don't think conmon is restarted by default. At least in my case it never came back by itself.

So this would halt monitoring of other containers (outside the one where this issue was triggered)?

I don't think it would. One container can crash only its own conmon as far as can tell. In my case I ran a bunch of containers doing basically the same thing and about 30% of them went down along with their conmons.

But these containers at least continue operating correctly right?

I think it depends on what they do. If they try to write data to file descriptors that are supposed to be managed by conmon they receive -EPIPE and most likely go down. All the logs from those containers get lost as well. It isn't possible to manage them normally and when they are finally stopped they can't be started normally because containers "must be in Created or Stopped state to be started: container state improper".

An attacker could trigger this remotely (in a public cloud service for example) right?

If there is a way to supply images or Dockerfiles used to build and run containers there I think it's possible to trigger that remotely.

They would at least need an account with the service provider?

I think whatever runs those containers needs to be authenticated one way or another. The question would be where all the images and Dockerfiles come from. If they are pulled from registries or say GitHub it isn't necessary to have a cloud service account. For example there are testing services where Dockerfiles are provided by random GitHub users and those users don't have to have access to the testing infrastructure directly.

Can be triggered easily by sending null bytes

Yes. The easiest (but not the fastest) way to trigger it reliably without ASan would be to cat /dev/random and wait for conmon to crash eventually.

User interaction is not required

I think it depends. If containers are pulled/built and run fully automatically user interaction isn't required.

@evverx
Copy link

evverx commented Nov 5, 2022

But these containers at least continue operating correctly right?

I think it depends on what they do. If they try to write data to file descriptors that are supposed to be managed by conmon they receive -EPIPE and most likely go down. All the logs from those containers get lost as well. It isn't possible to manage them normally and when they are finally stopped they can't be started normally because containers "must be in Created or Stopped state to be started: container state improper".

I seem to have misunderstood the question. I was talking about "crashing" containers (with my use case where a bunch of homogeneous containers do the same thing). The question seems to have been about the other containers (running in parallel and doing something else). Those other containers should be fine as far as I can tell. Sorry about that.

@rhatdan
Copy link
Member

rhatdan commented Nov 7, 2022

The risk is that Podman would loose track of the container processes, so it would not easily be able to kill the container.

@mheon
Copy link
Member

mheon commented Nov 7, 2022

Manual operations would still work (podman kill would hit the OCI runtime and kill the container, for example), but we would not be able to retrieve exit code. More serious is that we lose the ability to detect if the container exited on its own; as such, things like restart policy will not fire if the container exits.

@nickthetait
Copy link

Thanks all, that info helps narrow it down. From my reading, the effects are limited to a single "attacking" container. So here is how I would rate the impact 3.5/CVSS:3.1/AV:N/AC:L/PR:L/UI:R/S:U/C:N/I:N/A:L

@evverx
Copy link

evverx commented Nov 8, 2022

I think all in all it's more of a reliability issue in the sense that legitimate containers that happen to spew zero bytes here and there would go down and the conmon crash would prevent them from being restarted normally (and in that case A would be H). That same crash would prevent "malicious" containers from being restarted for eternity (which is a good thing I think) so I think the security impact is low and the vector seems to reflect that.

@evverx
Copy link

evverx commented Nov 8, 2022

Just to clarify as far as I can tell the worst thing that malicious containers can do is to dodge timeouts, restarts and so on and go about their business until they are somehow detected and stopped manually with, say, podman stop ....

@mheon
Copy link
Member

mheon commented Nov 9, 2022

Worth noting that this can potentially be used to obfuscate other malicious actions in the container - by killing Conmon, all logging of the container is also stopped.

@evverx
Copy link

evverx commented Nov 13, 2022

all logging of the container is also stopped.

On the bright side when conmon crashes systemd-coredump should collect full backtraces and coredumps and put them in the journal so it doesn't seem to be possible to roll out malicious containers stealthily in one fell swoop. It's kind of possible to hide those backtraces by bringing down systemd-coredump itself from inside containers before crashing conmon by triggering bugs like systemd/systemd@076b807 but some cryptic messages like

conmon[654652]: segfault at bd3000 ip 00007f0f64230a30 sp 00007ffedbb2bf98 error 4

should still end up in kmsg and the journal and make it possible to at least detect that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants