pod: add exit policies #13859
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: vrothberg. The full list of commands accepted by this bot can be found here; the pull request process is described here.
Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing …
@cdoern PTAL
I don't think this works with …
Actually, I take that back, …
We can potentially avoid the workqueue bits by doing this in a …
Though, that requires that we add it to every function that can stop a single container, which could be complicated.
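To make the deferred-stop idea being debated here concrete, below is a minimal, self-contained sketch of a worker queue with a drain-on-shutdown guarantee. All names (`WorkQueue`, `Queue`, `Shutdown`) are invented for illustration and are not the PR's actual API:

```go
package main

import (
	"fmt"
	"sync"
)

// WorkQueue runs queued jobs on background workers. Shutdown blocks
// until every pending job has finished, which is one way to ensure the
// runtime does not exit while a deferred "stop the pod" job is running.
type WorkQueue struct {
	jobs chan func()
	wg   sync.WaitGroup
}

func NewWorkQueue(workers, backlog int) *WorkQueue {
	q := &WorkQueue{jobs: make(chan func(), backlog)}
	for i := 0; i < workers; i++ {
		q.wg.Add(1)
		go func() {
			defer q.wg.Done()
			for job := range q.jobs {
				job()
			}
		}()
	}
	return q
}

// Queue hands a job to the workers. Container cleanup could queue the
// pod stop here instead of stopping the pod inline while it still holds
// locks, which is the deadlock this thread is discussing.
func (q *WorkQueue) Queue(job func()) {
	q.jobs <- job
}

// Shutdown stops accepting new jobs and waits for in-flight ones.
func (q *WorkQueue) Shutdown() {
	close(q.jobs)
	q.wg.Wait()
}

func main() {
	q := NewWorkQueue(2, 16)
	q.Queue(func() { fmt.Println("deferred: stopping pod") })
	q.Shutdown() // all pending jobs are done when this returns
}
```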
See #13859 (comment). There are deadlocks all over the place. We have the choice to either update all call sites of …
Note my use of the capitalized …
How would you call …
Given the e2e tests are also barking, I think the behavior should really be opt-in for ordinary pods but be turned on for …
I can only speak for the …
I think making this configurable is a good idea in case users have problems with the new behavior, but IMO the default should be to stop the infra. I see no reason why they have to keep running since they do nothing.
I thought the use case we added or were adding was that pods go away when the last container exits. They can start without a container, but if a container is added and removed, the infra container goes away.
The infra container will now be stopped but the pod will continue to exist. We can add a …
The question now is whether we want to auto-stop the infra container for ordinary pods. There are certain tests breaking that created a file in the pod's mount NS:
The second one is now breaking as the mount NS is destroyed when we stop the infra container.
Can we change the test to do: …
The second container should restart the pod, correct?
@rhatdan I feel rather strongly about making the behavior configurable. If the change already breaks the tests, there is no get-out-of-jail-free card if it breaks users. I started working on it yesterday.
@rhatdan PTAL
cmd/podman/pods/create.go (outdated)
@@ -72,6 +72,9 @@ func init() {
 	flags.StringVarP(&createOptions.Name, nameFlagName, "n", "", "Assign a name to the pod")
 	_ = createCommand.RegisterFlagCompletionFunc(nameFlagName, completion.AutocompleteNone)
 
+	flags.StringVarP(&createOptions.ExitPolicy, "exit-policy", "", "continue", "Behaviour when the last container exits")
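For context, the flag added in this hunk would be used roughly like this (illustrative invocation; `mypod` is a made-up name, and the two policy values are the ones discussed in this thread):

```console
$ podman pod create --name mypod --exit-policy=stop
```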
This is fine for now, but this should be set in containers.conf.
LGTM
I'm still unsure how we guarantee all pending jobs have finished before the runtime shuts down and Podman exits.
All kinds of red unhappy tests, and you need a rebase @vrothberg
Moving to WIP for now to do the plumbing for containers.conf.
Add a new `pod_exit_policy` field to the containers.conf's engine table. A pod's exit policy determines the behaviour when the last container of a pod exits. Required-in: containers/podman/pull/13859 Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
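Based on that commit message, the containers.conf side would presumably look something like the following. This is a sketch of the engine-table field the commit describes, not copied from the actual change:

```toml
[engine]
# Behaviour when the last container of a pod exits:
#   "continue" - the pod keeps running (default for `podman pod create`)
#   "stop"     - the pod is stopped (default for `podman play kube`)
pod_exit_policy = "stop"
```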
-> containers/common#1017 to move the definitions and parsing directly to c/common.
@vrothberg Just a small doubt: why do we need an extra …? Ref: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-phase
Containers follow the restart policy, not the Pod. Theoretically, if I add a restart=always container to a Pod, the container should restart even if the Pod is set to …
@rhatdan ah I see, but shouldn't containers inherit the …? Afaik, in the Kubernetes construct it is not possible to add a restart policy for a specific container; containers always inherit restartPolicy from the pod. Some references here: https://pkg.go.dev/k8s.io/api/core/v1#Container and https://pkg.go.dev/k8s.io/api/core/v1#PodSpec. I just want to make sure that a standard spec behaves in a similar manner between podman and k8s; other than that, I think this should be fine. So yeah, I guess it's fine. Edit: Nvm, not a blocker.
Yes, that is correct. That is why the tests set the restart policy to "onFailure" to actually trigger the "stop" exit policy when being run by …
It is also a topic that I wanted to reflect on next week: should the default restart policy really be "always"? I see why K8s does that, but should Podman do that as well?
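To illustrate the interplay described above: a Kubernetes YAML fed to `play kube` has to override the default restart policy for the "stop" exit policy to ever fire. A minimal example, with made-up pod and image names:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: exit-policy-demo
spec:
  # Without this, the default "Always" would restart the container
  # forever, and the pod would never reach the "stop" exit policy.
  restartPolicy: OnFailure
  containers:
  - name: demo
    image: alpine
    command: ["true"]
```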
It won't without this patch, as the infra container would continue to run. Hence, we need to stop the pod once all containers (other than the infra) are done. However, we cannot do that unconditionally, as it would break backwards compat with how …
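The "all containers other than the infra are done" check could be sketched as follows. This is a self-contained toy model with invented types, not libpod's internals:

```go
package main

import "fmt"

// ctr is a toy stand-in for a pod's container state.
type ctr struct {
	isInfra bool
	running bool
}

// shouldStopPod reports whether a pod with exit policy "stop" is ready
// to be stopped: no container other than the infra one is still running.
func shouldStopPod(exitPolicy string, ctrs []ctr) bool {
	if exitPolicy != "stop" {
		return false
	}
	for _, c := range ctrs {
		if !c.isInfra && c.running {
			return false
		}
	}
	return true
}

func main() {
	ctrs := []ctr{
		{isInfra: true, running: true},   // infra container still up
		{isInfra: false, running: false}, // workload container exited
	}
	fmt.Println(shouldStopPod("stop", ctrs)) // true: only infra remains
}
```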
Required for using the newly added pod exit policies. Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Add the notion of an "exit policy" to a pod. This policy controls the behaviour when the last container of a pod exits. Initially, there are two policies: - "continue" : the pod continues running. This is the default policy when creating a pod. - "stop" : stop the pod when the last container exits. This is the default behaviour for `play kube`. In order to implement the deferred stop of a pod, add a worker queue to the libpod runtime. The queue will pick up work items and in this case helps resolve deadlocks that would otherwise occur if we attempted to stop a pod during container cleanup. Note that the default restart policy of `play kube` is "Always". Hence, in order to really solve containers#13464, the YAML files must set a custom restart policy; the tests use "OnFailure". Fixes: containers#13464 Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
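A hedged sketch of how the two policies named in that commit message could be modeled and validated; the identifiers below are guesses for illustration, not necessarily the names used in libpod or c/common:

```go
package policy

import "fmt"

// PodExitPolicy is the behaviour when the last container of a pod exits.
type PodExitPolicy string

const (
	// PodExitPolicyContinue keeps the pod running (default for `pod create`).
	PodExitPolicyContinue PodExitPolicy = "continue"
	// PodExitPolicyStop stops the pod (default for `play kube`).
	PodExitPolicyStop PodExitPolicy = "stop"
)

// Parse validates a user-provided policy string, defaulting to "continue"
// when the string is empty.
func Parse(s string) (PodExitPolicy, error) {
	switch PodExitPolicy(s) {
	case PodExitPolicyContinue, PodExitPolicyStop:
		return PodExitPolicy(s), nil
	case "":
		return PodExitPolicyContinue, nil
	default:
		return "", fmt.Errorf("invalid pod exit policy: %q", s)
	}
}
```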
Done.
@mheon PTanotherL
LGTM
@@ -72,6 +72,10 @@ func init() {
 	flags.StringVarP(&createOptions.Name, nameFlagName, "n", "", "Assign a name to the pod")
 	_ = createCommand.RegisterFlagCompletionFunc(nameFlagName, completion.AutocompleteNone)
 
+	policyFlag := "exit-policy"
+	flags.StringVarP(&createOptions.ExitPolicy, policyFlag, "", string(containerConfig.Engine.PodExitPolicy), "Behaviour when the last container exits")
Suggested change:
-	flags.StringVarP(&createOptions.ExitPolicy, policyFlag, "", string(containerConfig.Engine.PodExitPolicy), "Behaviour when the last container exits")
+	flags.StringVar(&createOptions.ExitPolicy, policyFlag, string(containerConfig.Engine.PodExitPolicy), "Behaviour when the last container exits")
Small nit in case you have to repush: the flag has no shorthand, so StringVar is equivalent here.
LGTM on my end
Nice, @rhatdan, want to merge?
/lgtm