Not all metrics are deleted after removing ingress object from kubernetes cluster #10825

Open
dkhachyan opened this issue Jan 5, 2024 · 4 comments
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-priority needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@dkhachyan

What happened:

After an ingress object is created, host metrics are added with status="404" and with empty ingress and namespace labels.
After the ingress object is removed, these metrics are not deleted.

What you expected to happen:

The ingress controller should ignore metric events from a new host until the nginx config reload is complete.

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.): v1.9.3

Kubernetes version (use kubectl version): v1.27.5

Environment:

  • Cloud provider or hardware configuration: baremetal

  • OS (e.g. from /etc/os-release): Ubuntu 20.04.5 LTS

  • Kernel (e.g. uname -a): 5.15.0-43-generic

  • How was the ingress-nginx-controller installed: Kubespray

How to reproduce this issue:

  • Create a lot of ingress objects (2000+)
  • Create one more ingress object
  • Send requests to the new ingress host
  • Get status 404 (the nginx config is not ready yet)
  • Wait for status 200 (the nginx reload is complete)
  • Scrape the controller metrics and note that the metrics with the status="404" label have empty ingress and namespace labels
  • Remove the ingress object
  • Scrape the controller metrics again
  • Check that the metrics with the status="404" label and empty namespace and ingress labels have not been removed
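
The two scrape checks in the steps above could be automated with something like the following. This is only a rough sketch: the function name orphaned404Series and the sample scrape text are made up for illustration, and it uses plain substring matching on the Prometheus text format rather than a real parser, so a label whose name merely ends in "ingress" or "namespace" would also match.

```go
package main

import (
	"fmt"
	"strings"
)

// orphaned404Series returns the sample lines of a Prometheus text-format
// scrape that have status="404" together with empty ingress and namespace
// labels. The name is hypothetical; it is not part of ingress-nginx.
func orphaned404Series(scrape string) []string {
	var found []string
	for _, line := range strings.Split(scrape, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and HELP/TYPE comments
		}
		if strings.Contains(line, `status="404"`) &&
			strings.Contains(line, `ingress=""`) &&
			strings.Contains(line, `namespace=""`) {
			found = append(found, line)
		}
	}
	return found
}

func main() {
	// Made-up scrape output, shaped like the series described in this issue.
	scrape := `# TYPE nginx_ingress_controller_requests counter
nginx_ingress_controller_requests{host="new.example.com",ingress="",namespace="",status="404"} 3
nginx_ingress_controller_requests{host="new.example.com",ingress="web",namespace="demo",status="200"} 10
`
	for _, line := range orphaned404Series(scrape) {
		fmt.Println(line)
	}
}
```

Running the same check before and after removing the ingress object shows whether the series with empty labels is still exposed.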

Anything else we need to know:

In very large environments, while the nginx config is reloading, the ingress controller exposes metrics with a status="404" label and
empty namespace="" and ingress="" labels. Because these labels are empty, such metrics cannot be deleted.
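
To illustrate why the empty labels matter, here is a minimal sketch. It assumes that cleanup selects series by comparing their namespace and ingress label values with those of the deleted object. The series type and the removeForIngress function are hypothetical stand-ins, not the controller's actual code.

```go
package main

import "fmt"

// series is a simplified stand-in for the label set of one metric series.
type series map[string]string

// removeForIngress drops every series whose namespace and ingress labels
// match the deleted ingress object and returns the series that remain.
// Hypothetical helper, written only to illustrate the matching problem.
func removeForIngress(all []series, namespace, ingress string) []series {
	var kept []series
	for _, s := range all {
		if s["namespace"] == namespace && s["ingress"] == ingress {
			continue // belongs to the deleted ingress, drop it
		}
		kept = append(kept, s)
	}
	return kept
}

func main() {
	all := []series{
		{"host": "new.example.com", "namespace": "demo", "ingress": "web", "status": "200"},
		// Recorded before the reload finished: host is set, the other labels are empty.
		{"host": "new.example.com", "namespace": "", "ingress": "", "status": "404"},
	}
	for _, s := range removeForIngress(all, "demo", "web") {
		fmt.Println(s["host"], s["status"])
	}
}
```

In this sketch the status="404" series survives the cleanup because nothing ties it back to the demo/web ingress, which is the behaviour described above.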

@dkhachyan dkhachyan added the kind/bug Categorizes issue or PR as related to a bug. label Jan 5, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jan 5, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@longwuyuan
Contributor

The time series data is retained, but no new data is appended, so this does not seem like a bug. Let's wait for other comments.

/remove-kind bug

@k8s-ci-robot k8s-ci-robot added needs-kind Indicates a PR lacks a `kind/foo` label and requires one. and removed kind/bug Categorizes issue or PR as related to a bug. labels Jan 7, 2024
@dkhachyan
Author

Yes, new data is not appended, but this metric will remain after the ingress object is deleted because its ingress and namespace labels are empty.

ns, ok := labels["namespace"]

I think the problem is here

n.metricCollector.SetHosts(hosts)

The host list is updated before the new host has been added to the nginx config.

Maybe I should add something like this:

if sc.metricsPerHost && !sc.hosts.Has(stats.Host) {

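if sc.metricsPerHost {
	// Skip hosts that are not in the served host set.
	if !sc.hosts.Has(stats.Host) {
		klog.V(3).InfoS("Skipping metric for host not being served", "host", stats.Host)
		continue
	}
	// Skip samples that name a host but have no namespace or ingress yet.
	if stats.Host != "" {
		if stats.Namespace == "" || stats.Ingress == "" {
			continue
		}
	}
}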
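
For anyone who wants to try the idea outside the controller, the proposed check can be expressed as a small standalone function. This is a sketch under assumptions: the stat struct, the keep function, and the plain map used as a host set are all hypothetical simplifications of the collector's real types.

```go
package main

import "fmt"

// stat is a hypothetical, trimmed-down version of the per-request stats
// the collector receives; the real struct has more fields.
type stat struct {
	Host, Namespace, Ingress, Status string
}

// keep reports whether a sample should be recorded. With per-host metrics
// enabled it rejects hosts that are not being served, and also samples that
// name a host but carry no namespace or ingress.
func keep(s stat, served map[string]bool, metricsPerHost bool) bool {
	if !metricsPerHost {
		return true
	}
	if !served[s.Host] {
		return false
	}
	if s.Host != "" && (s.Namespace == "" || s.Ingress == "") {
		return false
	}
	return true
}

func main() {
	served := map[string]bool{"new.example.com": true}
	samples := []stat{
		{Host: "new.example.com", Status: "404"}, // arrived before the reload finished
		{Host: "new.example.com", Namespace: "demo", Ingress: "web", Status: "200"},
	}
	for _, s := range samples {
		fmt.Println(s.Status, keep(s, served, true))
	}
}
```

The first sample is rejected even though its host is already in the served set, which is the case the extra namespace and ingress check is meant to cover.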


github-actions bot commented Feb 8, 2024

This is stale, but we won't close it automatically. Just bear in mind that the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any questions or a request to prioritize this, please reach #ingress-nginx-dev on Kubernetes Slack.

@github-actions github-actions bot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Feb 8, 2024