Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add k8s.container.status.state metric #1784

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

povilasv
Copy link
Contributor

@povilasv povilasv commented Jan 22, 2025

Fixes #1672

Changes

Adds k8s.container.status.state metric, it would allow us to alert and monitor containers in not ready state.

I'm still not sure if this should be multiple different metrics or a single one 🤔

The current problems, with single metric:

  • Running state doesn't have a reason, so we would set reason to empty.
  • Waiting state and Terminated state have different sets of reasons:

Waiting state - "ContainerCreating", "CrashLoopBackOff", "CreateContainerConfigError", "ErrImagePull", "ImagePullBackOff"

Terminated state - "OOMKilled", "Completed", "Error", "ContainerCannotRun"

Alternative approach would be to do what KSM does:

  • k8s.container.status.state metric without reason attribute.
  • k8s.container.status.waiting_reason metric for waiting reason enum.
  • k8s.container.status.terminated_reason metric for terminated reason enum.

This is not intended to merge, I would appreciate any feedback to see what we want to do here.

Merge requirement checklist

@povilasv povilasv force-pushed the container-status branch 4 times, most recently from 8de6ad2 to 726203e Compare January 22, 2025 07:47
@povilasv povilasv marked this pull request as ready for review January 22, 2025 07:48
@povilasv povilasv requested review from a team as code owners January 22, 2025 07:48
brief: ""
instrument: updowncounter
unit: ""
attributes:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to name these as k8s.container.status.state and k8s.container.status.reason so as to have meaningful names in case we want to re-use them in other signals like Entities (k8s.container.reason does not make a lot of sense).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed the attributes, thanks!

- ref: k8s.container.state
requirement_level: required
- ref: k8s.container.reason
requirement_level: required
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have it as required, the problem is that states which do not have a reason won't have it provided. Maybe that could be left "unknown" or just empty?

@open-telemetry/semconv-k8s-approvers @open-telemetry/specs-semconv-approvers thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the "required" for now, but would appreciate suggestions what we want to do here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

Add k8s.container.status.waiting metric to semantic conventions
2 participants