Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add container error message to ContainerState #16806

Conversation

jakecorrenti
Copy link
Member

@jakecorrenti jakecorrenti commented Dec 11, 2022

This change aims to store an error message to the ContainerState struct with the last known error from the Start, StartAndAttach, and Stop OCI Runtime functions.

The goal was to act in accordance with Docker's behavior.

Example:

$ podman create -t --name demo alpine efcho Hello World
$ podman start demo
$ podman inspect -f '{{ .State.Error }}' demo
crun: executable file `efcho` not found in $PATH: No such file or directory: OCI runtime attempted to invoke a command that was not found

Fixes: #13729

Signed-off-by: Jake Correnti jakecorrenti+github@proton.me

Does this PR introduce a user-facing change?

Adds a feature for the user to see why a container failed when inspecting it.

@openshift-ci openshift-ci bot added release-note do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Dec 11, 2022
@jakecorrenti
Copy link
Member Author

I still need to figure out the best way to test all of the different ways an error could be saved to the ContainerState and make sure the message is correct in podman inspect.

I also need to verify that everywhere I have saveContainerError is correct and there aren't any that are unnecessary/missing (I continue to go back and forth with myself on where this should be added).

@jakecorrenti jakecorrenti force-pushed the podman-inspect-add-error-info branch 2 times, most recently from 9920290 to 80887fc Compare December 11, 2022 15:16
Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably easier to return a named error and use a defer to save the error. For instance

defer func() {
if finalErr != nil {
   saveContainerError(...)
}
}()

Any chance we move saving the error further up the call stack? The higher we are, the less locations we have to update (and will probably miss in the future).

@mheon PTAL

@jakecorrenti
Copy link
Member Author

I definitely agree the named error would be easier.

The issue I was having with the location of the error being saved came down to the container API. To the best of my knowledge, I don't actually get access to the container state until this point in the call stack, unless I was to modify the container API (which I tried to avoid). I accept the fact there's a good chance I'm wrong, though.

@mheon
Copy link
Member

mheon commented Dec 12, 2022

I'm OK with leaving the error where it is... Moving out of Libpod is just going to start introducing substantial complexity (having the ability to update the DB outside of libpod is problematic).

libpod/container_api.go Outdated Show resolved Hide resolved
@jakecorrenti jakecorrenti force-pushed the podman-inspect-add-error-info branch 2 times, most recently from 775fc9c to c5a5d09 Compare December 13, 2022 00:08
@jakecorrenti jakecorrenti changed the title [WIP] Add container error message to ContainerState Add container error message to ContainerState Dec 13, 2022
@jakecorrenti jakecorrenti marked this pull request as ready for review December 13, 2022 04:29
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 13, 2022
Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM.

Note that the diff would be even smaller when calling the named error finalError or something that is not yet used in the function bodies. Than, they can continue to return err.

@mheon PTAL

test/e2e/inspect_test.go Show resolved Hide resolved
@jakecorrenti jakecorrenti force-pushed the podman-inspect-add-error-info branch 3 times, most recently from d195045 to e9209c9 Compare December 13, 2022 22:16
@TomSweeneyRedHat
Copy link
Member

All kinds of red unhappy tests

@jakecorrenti jakecorrenti force-pushed the podman-inspect-add-error-info branch from e9209c9 to 03bc0c8 Compare December 31, 2022 04:47
@jakecorrenti
Copy link
Member Author

Is it possible to restart the tests? Looks like some unrelated failure with a missing catatonit executable

defer func() {
if finalErr != nil {
_ = saveContainerError(c, finalErr)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you logrus.Debug this error if not nil?

func (c *Container) StartAndAttach(ctx context.Context, streams *define.AttachStreams, keys string, resize <-chan resize.TerminalSize, recursive bool) (retChan <-chan error, finalErr error) {
defer func() {
if finalErr != nil {
_ = saveContainerError(c, finalErr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you logrus.Debug this error if not nil?

func (c *Container) StopWithTimeout(timeout uint) (finalErr error) {
defer func() {
if finalErr != nil {
_ = saveContainerError(c, finalErr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you logrus.Debug this error if not nil?

This change aims to store an error message to the ContainerState struct
with the last known error from the Start, StartAndAttach, and Stop OCI
Runtime functions.

The goal was to act in accordance with Docker's behavior.

Fixes: containers#13729

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
@jakecorrenti jakecorrenti force-pushed the podman-inspect-add-error-info branch from 03bc0c8 to df02cb5 Compare January 3, 2023 18:21
@rhatdan
Copy link
Member

rhatdan commented Jan 3, 2023

/lgtm
/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 3, 2023
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jan 3, 2023
@rhatdan
Copy link
Member

rhatdan commented Jan 5, 2023

/approve

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jan 5, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jakecorrenti, rhatdan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 5, 2023
@jakecorrenti
Copy link
Member Author

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 5, 2023
@openshift-merge-robot openshift-merge-robot merged commit b7314bd into containers:main Jan 5, 2023
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 16, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 16, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. release-note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cannot determine if a container was unable to start
6 participants