Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify svclb daemonset #10954

Merged
merged 2 commits into from
Oct 11, 2024
Merged

Simplify svclb daemonset #10954

merged 2 commits into from
Oct 11, 2024

Conversation

manuelbuil
Copy link
Contributor

@manuelbuil manuelbuil commented Sep 26, 2024

Proposed Changes

Simplify svclb by replacing the klipper-lb image with a simpler one and remove extra permissions, etc required by the daemonset to be able to run iptables and forward traffic

Types of Changes

New feature

Verification

Create a service of type LoadBalancer and verify that:
1 - It gets and externalIP
2 - The service is reachable via that IP

Testing

Linked Issues

#11077

User-Facing Change

Stop using klipper-lb as the image for svclb. Replace it with a simple busybox which just sleeps

Further Comments

@manuelbuil manuelbuil requested a review from a team as a code owner September 26, 2024 17:39
@brandond
Copy link
Member

brandond commented Sep 26, 2024

If nothing else, the daemonset currently handles port conflicts by ensuring that the LB pods are not scheduled to nodes where the port is currently in use, which in turn ensures that the loadbalancer does not advertise those node IPs. We would need to do the same via some other mechanism if we remove use of the daemonset.

I'm also curious if this works with kube-proxy replacements from other CNIs.

@manuelbuil
Copy link
Contributor Author

If nothing else, the daemonset currently handles port conflicts by ensuring that the LB pods are not scheduled to nodes where the port is currently in use, which in turn ensures that the loadbalancer does not advertise those node IPs. We would need to do the same via some other mechanism if we remove use of the daemonset.

Thanks Brad for the feedback. That's a good point, kube-proxy will not complain if the port is already used by other k8s service or by the node. I wonder if we could leave the daemonset with a dummy image (e.g. busybox) just as a mechanism to prevent port duplication. The alternatives to that seem all pretty complicated.

I'm also curious if this works with kube-proxy replacements from other CNIs.

Yes, Cilium can take care of creating the infrastructure for loadbalancer services.

@manuelbuil manuelbuil changed the title Remove svclb daemonset Simplify svclb daemonset Sep 27, 2024
@@ -55,7 +55,7 @@ const (
)

var (
DefaultLBImage = "rancher/klipper-lb:v0.4.9"
DefaultLBImage = "registry.suse.com/bci/bci-busybox:latest"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't use the latest tag, and this needs to come from something in the rancher/mirrored- images alongside everything else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, is this a requirement from CNCF?

Copy link
Member

@brandond brandond Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, but not using latest is a best practice (and required for airgap), and moving off docker hub to registry.suse.com would be a bigger deal as existing users may expect all of our images to be under the docker.io/rancher namespace, and we'd want to coordinate that change much further out.

Copy link
Member

@brandond brandond left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see tests that ensure that this does not change current behavior with regards to internalTrafficPolicy, externalTrafficPolicy, and lb source ranges.

@manuelbuil
Copy link
Contributor Author

I'd like to see tests that ensure that this does not change current behavior with regards to internalTrafficPolicy, externalTrafficPolicy, and lb source ranges.

#10972

@brandond
Copy link
Member

brandond commented Oct 3, 2024

As we discussed in standup, I'm also curious if having ServiceLB set VIP: Proxy in the LoadBalancer status field, so that traffic actually hits the pod rules, allows the source address through as desired. If so, then the issues folks have been reporting are due to kube-proxy and we might consider keeping the svclb iptables rules around.

@@ -174,6 +176,13 @@ func (k *k3s) onChangeEndpointSlice(key string, eps *discovery.EndpointSlice) (*
return eps, nil
}

// onChangeService handles changes to Services. This is used to ensure that changes in the Service are
// reflected in the LoadBalancer status (e.g. changing to externalTrafficPolicy=Local which should remove externalIPs)
func (k *k3s) onChangeService(key string, svc *core.Service) (*core.Service, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why this is necessary. This would appear to duplicate the OnChange handler in the core Kubernetes cloud-controller code base, which calls EnsureLoadBalancer or UpdateLoadBalancer when the Service object changes. Is it not calling those handlers on the right events to see these changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imagine I have a cluster with two nodes and an HTTP server in one of the nodes. That server can be accessed by a service of type LoadBalancer. It'll have two externalIPs: node1-IP and node2-IP.

Now, I change the field externalTrafficPolicy from Cluster to Local. That should reduce the number of externalIPs to only 1.

When I do that, I see that EnsureLoadBalancer gets called, which redeploys the daemonset. In the current code, the daemonset manifest gets updated depending on the externalTrafficPolicy value. However, in my PR, that part of the manifest is removed.

To update the service from 2 externalIPs to 1, the function updateStatus must be called: /~https://github.com/k3s-io/k3s/blob/master/pkg/cloudprovider/servicelb.go#L231. In the current code, it gets called via the onChangePod function because the svclb daemonset changed. However, in my PR, this is not happening. That's why I need a function to add the "item" to the workqueue, e.g. k.workqueue.Add(eps.Namespace + "/" + serviceName), so that updateStatus gets called.

An alternative to this would be to add a "dummy" label/annotation in the svclb daemonset to force onChangePod function to get called when externalTrafficPolicy changes. Do you know if adding a new watcher on services is expensive in terms of resource consumption?

Copy link
Member

@brandond brandond Oct 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not super expensive but it does mean that both the built-in CCM and Wrangler are now both watching Services. Adding a dummy annotation so that the behavior of the daemonset is retained might be more efficient than adding another watch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add the label to the ds. That way, this new alternative, is more similar to what we are currently doing

Copy link

codecov bot commented Oct 7, 2024

Codecov Report

Attention: Patch coverage is 71.42857% with 2 lines in your changes missing coverage. Please review.

Project coverage is 43.90%. Comparing base (7552203) to head (1befd65).
Report is 3 commits behind head on master.

Files with missing lines Patch % Lines
pkg/cloudprovider/servicelb.go 71.42% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10954      +/-   ##
==========================================
- Coverage   49.93%   43.90%   -6.04%     
==========================================
  Files         178      178              
  Lines       14816    14764      -52     
==========================================
- Hits         7399     6482     -917     
- Misses       6069     7080    +1011     
+ Partials     1348     1202     -146     
Flag Coverage Δ
e2etests 36.13% <71.42%> (-9.91%) ⬇️
inttests 36.74% <71.42%> (-0.13%) ⬇️
unittests 13.59% <0.00%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

brandond
brandond previously approved these changes Oct 8, 2024
brandond
brandond previously approved these changes Oct 9, 2024
Signed-off-by: manuelbuil <mbuil@suse.com>
Signed-off-by: manuelbuil <mbuil@suse.com>
@manuelbuil manuelbuil merged commit 660c605 into k3s-io:master Oct 11, 2024
31 checks passed
@manuelbuil manuelbuil deleted the removesvclbds branch October 11, 2024 08:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants