-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix podman stop
and podman run --rmi
#23670
Conversation
Since we were talking about this please add a Fixes #22852 and Fixes for the Jira issue to your commit |
test/system/030-run.bats
Outdated
# Try again with a detached container and verify it works | ||
cname=c_$(safename) | ||
run_podman run -d --name $cname --rmi $NONLOCAL_IMAGE /bin/true | ||
run_podman wait $cname |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will not work, see my latest wait changes. podman wait waits for stop not removal so this will flake a lot.
And you cannot even wait for removing as the image is removed after the ctr is removed so there would still be a race...
I suggest you a sleep loop to wait until image exists command fails and then have some timeout in there
pkg/domain/infra/abi/containers.go
Outdated
// We still need to clean up since we do not know if the other cleanup process is successful | ||
if c.AutoRemove() && (errors.Is(err, define.ErrNoSuchCtr) || errors.Is(err, define.ErrCtrRemoved)) { | ||
return nil | ||
if c.AutoRemove() && !c.ShouldRestart(ctx) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do not call c.ShouldRestart(ctx) here, a container that is stopped by hand cannot by definition restart again.
pkg/domain/infra/abi/containers.go
Outdated
if (errors.Is(err, define.ErrNoSuchCtr) || errors.Is(err, define.ErrCtrRemoved)) { | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is now wrong, we only ignore the error in case of auto remove removed the ctr in the background.
I guess there is harm harm ignoring it here though but is is a change of logic that isn't needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think ignoring is still safer - there's an inherent race around the cleanup here as we unlock after Stop(), so a podman rm
can sneak in before cleanup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it can yes sure, but is podman stop ctr
and then ctr is removed in the middle not an error to the end user? Unless the ctr is configured for auto removed I would argue this will be a user error if they stop/remove in parallel IMO and something they should notice. But I can accept either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's an error case since the stop actually happened, and the only thing we're erroring on is cleanup - so we did what was asked, and only the cleanup logic is throwing an error.
4fcbb55
to
4122abb
Compare
Addresses: https://issues.redhat.com/browse/RHEL-39513 |
ba1d729
to
79a3dba
Compare
@containers/podman-maintainers This is ready |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, some suggestions for the test
This started off as an attempt to make `podman stop` on a container started with `--rm` actually remove the container, instead of just cleaning it up and waiting for the cleanup process to finish the removal. In the process, I realized that `podman run --rmi` was rather broken. It was only done as part of the Podman CLI, not the cleanup process (meaning it only worked with attached containers) and the way it was wired meant that I was fairly confident that it wouldn't work if I did a `podman stop` on an attached container run with `--rmi`. I rewired it to use the same mechanism that `podman run --rm` uses, so it should be a lot more durable now, and I also wired it into `podman inspect` so you can tell that a container will remove its image. Tests have been added for the changes to `podman run --rmi`. No tests for `stop` on a `run --rm` container as that would be racy. Fixes containers#22852 Fixes RHEL-39513 Signed-off-by: Matt Heon <mheon@redhat.com>
79a3dba
to
458ba5a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests LGTM
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: edsantiago, mheon The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
return fmt.Errorf("removing container %s image %s: %w", c.ID(), imageName, mErr) | ||
} | ||
} | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like if you specify AutoRemoteImage, we are no longer cleaning up the containers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing the image implies removing the container, and removing the container also unmounts, cleans up the network, etc, so we should be OK.
/lgtm |
43fe3eb
into
containers:main
This started off as an attempt to make
podman stop
on a container started with--rm
actually remove the container, instead of just cleaning it up and waiting for the cleanup process to finish the removal.In the process, I realized that
podman run --rmi
was rather broken. It was only done as part of the Podman CLI, not the cleanup process (meaning it only worked with attached containers) and the way it was wired meant that I was fairly confident that it wouldn't work if I did apodman stop
on an attached container run with--rmi
. I rewired it to use the same mechanism thatpodman run --rm
uses, so it should be a lot more durable now, and I also wired it intopodman inspect
so you can tell that a container will remove its image.Tests have been added for the changes to
podman run --rmi
. No tests forstop
on arun --rm
container as that would be racy.Does this PR introduce a user-facing change?