✨ APIExportEndpointSlice reconciliation #2432
Conversation
Force-pushed from 900b199 to 785e70a
pkg/server/server.go (outdated)
@@ -456,6 +456,12 @@ func (s *Server) Run(ctx context.Context) error {
		}
	}

	if s.Options.Cache.Enabled && (s.Options.Controllers.EnableAll || enabled.Has("apiexportendpointslice")) {
So this doesn't work if the cache server is disabled? What is the fallback option for a service provider wanting to know which URLs to use?
Yes. I should have raised that. Here are my views:
- a sharded environment implies a cache server
- the construct (APIExportEndpointSlice) is of little value for a non-sharded environment
- it is not a good UX for service providers to have a different API for sharded and non-sharded environments. Most probably they want to develop their service once and have it work for both
I would propose that in a non-sharded / no-cache-server environment, informers for the local instance are passed to the controller instead of the ones for the cache server, e.g.:
c, err := apiexportendpointslice.NewController(
	s.KcpSharedInformerFactory.Apis().V1alpha1().APIExportEndpointSlices(),
	// ClusterWorkspaceShards and APIExports get retrieved from the cache server
	s.CacheKcpSharedInformerFactory.Tenancy().V1alpha1().ClusterWorkspaceShards(),
	s.CacheKcpSharedInformerFactory.Apis().V1alpha1().APIExports(),
	kcpClusterClient,
)
vs
c, err := apiexportendpointslice.NewController(
	s.KcpSharedInformerFactory.Apis().V1alpha1().APIExportEndpointSlices(),
	// ClusterWorkspaceShards and APIExports get retrieved from the local shard
	s.KcpSharedInformerFactory.Tenancy().V1alpha1().ClusterWorkspaceShards(),
	s.KcpSharedInformerFactory.Apis().V1alpha1().APIExports(),
	kcpClusterClient,
)
As the cache server is Kubernetes API compliant, the same code should work with both. As there is a single shard, all the relevant information should be provided by it.
What do you think?
We usually pass in both and have the controller try the local ones first, then fall back to the cache server ones. Right @stevekuznetsov @p0lyn0mial @sttts ?
I had the same question here: #2432 (comment)
It's a fundamental question of design here
We usually pass in both and have the controller try the local ones first, then fall back to the cache server ones.
With the approach you describe, you need the informers for the cache server to be started. This is not the case today if the cache server is disabled. This could be done, but I feel that we are conflating two scenarios here:
1. Do we want kcp to work without any cache server at all? My understanding is that the answer is yes, but it is then a single-sharded kcp. My understanding is that the cache server is the communication mechanism between shards. My proposal here, passing local informers instead of cache informers, was to address this specific scenario.
2. What is the best degradation we can offer when the cache server is supposed to be reachable but is not? My thinking on this scenario was the answer here: ✨ APIExportEndpointSlice reconciliation #2432 (comment)
Let's go through (2). We have 10 shards and 1 shard gets updated:
- If the cache server is not reachable before the reconciliation loop starts, the loop does not start for any APIExportEndpointSlice with the current implementation in this PR. With secondary local informers it would only start either for the root shard (current implementation) or for the modified shard (if the ClusterWorkspaceShard gets created locally instead of on the root shard). With the root-shard approach, the APIExportEndpointSlices in the root shard get the correct endpoints; the others are stale. That is 10% better, or 1% if we have 100 shards. With ClusterWorkspaceShards created locally, the reconciliation logic sees only one shard, the local one, through the secondary informers. Any change would wipe out the endpoints for the remaining 9 shards and possibly be 9 times worse for the APIExportEndpointSlices on the modified shard. APIExportEndpointSlices on other shards don't see the change.
- If the cache server is not reachable during the reconciliation loop, no change is applied with the current implementation in this PR. With secondary informers, the information for APIExports on other shards won't be accessible, so 90% stay unchanged; for the remaining 10% the APIExport information is available. With ClusterWorkspaceShards created in the root shard, again the APIExportEndpointSlices on the root shard get properly populated, but the controller for APIExportEndpointSlices on other shards sees zero ClusterWorkspaceShard resources and would wipe out all the endpoints, which would be far worse than doing nothing. With ClusterWorkspaceShards created locally, the controller sees only its local shard and would wipe out 9 of the 10 endpoints, again far worse than doing nothing.
Let's not diverge between local and sharded mode. Let's run the cache server even in single-shard mode.
Let's run the cache server even in single-shard mode.
This is already implemented: we can run the cache server with a kcp instance in the same process (embedded mode).
As discussed, I will modify the flags so that the cache server is started in embedded mode by default, and as a separate process if additional information is provided.
				apisv1alpha1.APIExportValid,
				apisv1alpha1.InternalErrorReason,
				conditionsv1alpha1.ConditionSeverityError,
				"Error getting APIExport %s|%s: %v",
This error is something other than NotFound from a lister - unclear what it might even be, but does it make sense to update status here? Can we retry the reconciliation? What might a user do with the information here?
It's either impossible or essentially impossible to get other kinds of errors, given the current implementation of a lister. But in any case, I do think updating a condition is appropriate - it provides feedback to the user that something went wrong. They may not have any recourse, but it's better than giving them no information and having the key stuck in an endless exponential backoff loop.
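For context, the diff hunk above shows the arguments of such a condition update. A minimal sketch of how it might be set, assuming kcp's vendored cluster-api conditions helpers; the surrounding variable names (apiExportEndpointSlice, clusterName, apiExportName) are assumptions for illustration, not the PR's code:

// Hypothetical sketch: mark APIExportValid false on the APIExportEndpointSlice
// when the APIExport lookup fails with a non-NotFound error; the workqueue
// then retries the key with exponential backoff.
conditions.MarkFalse(
	apiExportEndpointSlice,
	apisv1alpha1.APIExportValid,
	apisv1alpha1.InternalErrorReason,
	conditionsv1alpha1.ConditionSeverityError,
	"Error getting APIExport %s|%s: %v",
	clusterName, apiExportName, err,
)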
pkg/server/controllers.go (outdated)

	c, err := apiexportendpointslice.NewController(
		s.KcpSharedInformerFactory.Apis().V1alpha1().APIExportEndpointSlices(),
		// ClusterWorkspaceShards and APIExports get retrieved from cache server
Does a shard have some local storage for its own definition? Surely there will also be APIExports on the shard that do not need to be requested from the cache - we can progress in processing that data even when the cache is down, right?
Does a shard have some local storage for its own definition?
Currently not; it is all in the root workspace of the root shard. As per the above, I think we would need it.
There will be APIExports on the shard that do not need to be requested from the cache - we can progress in processing that data even when the cache is down, right?
That's true, but I am not sure it is worth it. We need the shard information to get more than a single endpoint, and that can only be retrieved from the cache server. I would rather keep things "frozen" when the cache server cannot be reached: existing services are not impacted by the unavailability of the cache server as long as the shards don't change. Looking only at the information on the local shard, which is currently only available for the root shard if the APIExport has been created there, would only give a partial picture that may be worse than the stale snapshot captured in the APIExportEndpointSlices before the cache server became unavailable.
I think that is against our principle of single-shard functionality without external data. We need to support "working as much as possible" when the larger system has network partitions.
@p0lyn0mial would appreciate your thoughts on this
Does a shard have some local storage for its own definition? Surely there will also be APIExports on the shard that do not need to be requested from the cache - we can progress in processing that data even when the cache is down, right?
Yes. As of today our approach is to use two informers (maybe in the future we could unify them and provide a single instance/interface).
The approach so far has been to first check the local informer and fall back to the second one, backed by the cache server. For example:
getAPIExport: func(clusterName logicalcluster.Name, name string) (*apisv1alpha1.APIExport, error) {
	apiExport, err := localAPIExportInformer.Lister().Cluster(clusterName).Get(name)
	if errors.IsNotFound(err) {
		return cacheApiExportInformer.Lister().Cluster(clusterName).Get(name)
	}
	return apiExport, err
}
I think we could apply this strategy to all places that require working with the caching layer.
When it comes to working with Shard resources in the event of a failure, we could at least register the shard we are currently running on, since all we need is Spec.VirtualWorkspaceURL.
Does it make sense?
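For illustration, a hypothetical sketch of that fallback; the variable names (cacheShardInformer, localShard) are assumptions, not taken from this PR:

// Hypothetical sketch: build the shard list from the cache-server-backed
// informer and, if it comes back empty, fall back to registering at least the
// shard we are running on, since only Spec.VirtualWorkspaceURL is needed.
shards, err := cacheShardInformer.Lister().List(labels.Everything())
if err != nil {
	return err
}
if len(shards) == 0 && localShard != nil {
	shards = []*tenancyv1alpha1.ClusterWorkspaceShard{localShard}
}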
I think we could apply this strategy to all places that require working with the caching layer.
I think that we need to distinguish between two types of requests:
- I am looking for APIExport A, or another global resource type; the existing approach (check locally and then the cache) looks OK to me.
- I want to build a complete view and am looking for all shards. This is the scenario here. In that case I don't think that local requests bring any added value.
When it comes to working with Shard resources in the event of a failure, we could at least register the shard we are currently running on, since all we need is Spec.VirtualWorkspaceURL.
If the APIExportEndpointSlice was just created, nothing is broken; it is just not yet working. If the cache server is "temporarily" not accessible it is debatable, but my gut feeling is that it is simpler, clearer, and cleaner to wait for it to become reachable again and for the reconciliation to happen.
If the APIExportEndpointSlice already existed and had already been reconciled once, looking at the local shard only may (not today) return a single shard. Reconciling against that may mean removing hundreds of valid existing URLs, plus possibly one invalid one (if a shard change triggered the reconciliation), and replacing them with a single URL (the local shard). I would argue that this is far worse than doing nothing: keeping the stale state and waiting for the cache server to be reachable again so that the state finally becomes up-to-date.
To come back to:
Does a shard have some local storage for its own definition?
As per my previous answer: no. I am proposing to change that, to stop having the root shard be special, with the cache server becoming the preferred communication mechanism between shards. I feel, however, that this is orthogonal to this pull request.
In general we won't gate a shard when the cache server is not ready/available. That means that a local cache (informer) might give you an empty list.
Also, nothing will remove the content received (by the informer) from the cache server when it becomes unavailable. In the future we might provide users with "staleness" information that would help them drive some decisions.
Given the above, we need to decide what to do after a restart and without access to the cache server.
I was proposing to add at least a local shard instead of wiping the list (/~https://github.com/kcp-dev/kcp/pull/2432/files#diff-a028d37be8294707fd6fdcb63d85f41486832f61eb70c14605fc5fc712958f48R134).
I'm fine with not changing an already populated list, but that would make the code more complicated because we need to distinguish between a list of shards with only 1 item and a list with only a local shard.
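For comparison, a hypothetical sketch of the "do not touch an already populated list" alternative; the names (endpointSlice, listShards, endpointsFor) and the status field are assumptions for illustration only:

// Hypothetical sketch: if the cache-backed shard listing is empty (e.g. after
// a restart without access to the cache server) and endpoints were already
// populated, keep the stale list rather than wiping it or shrinking it to the
// local shard only.
shards, err := listShards()
if err != nil {
	return err
}
if len(shards) == 0 && len(endpointSlice.Status.APIExportEndpoints) > 0 {
	// nothing trustworthy to reconcile against; keep the stale endpoints
	return nil
}
endpointSlice.Status.APIExportEndpoints = endpointsFor(shards)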
Added a commit following our Slack discussion. My preference would be without it.
Force-pushed from 4868e4b to 374454b
/milestone v0.11
Force-pushed from b55f98b to 2d05eb3
Force-pushed from 2d05eb3 to bb00beb
Force-pushed from bb00beb to 152d9c7
/retest
1 similar comment
/retest
Force-pushed from 152d9c7 to 09bfca0
/retest
@ncdc @sttts I have completed the changes; the tests pass except for ppc64le, but that is an infrastructure issue.
Force-pushed from f175b5c to 3f9673c
Force-pushed from ecb6df4 to 666d657
			runtime.HandleError(err)
			continue
		} else if !exists {
			runtime.HandleError(fmt.Errorf("APIBinding %q does not exist", key))
This is not an error. The index can change at any time, especially since the IndexKeys call above.
ack
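A minimal sketch of the suggested handling, with the indexer, keys, and process helper assumed from the surrounding loop: absence after GetByKey is treated as normal rather than reported.

// Sketch: the keys come from an IndexKeys call, so an object may legitimately
// be gone by the time GetByKey runs; skip silently instead of reporting it.
for _, key := range keys {
	obj, exists, err := indexer.GetByKey(key)
	if err != nil {
		runtime.HandleError(err)
		continue
	}
	if !exists {
		// the index changed since the IndexKeys call; nothing to do for this key
		continue
	}
	process(obj) // hypothetical helper
}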
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: sttts. The full list of commands accepted by this bot can be found here. The pull request process is described here.
/hold
	return utilerrors.NewAggregate(errs)
}

func filterShardEvent(oldObj, newObj interface{}) bool {
nit: in the filtered event handler for informers the logic is inverse, i.e. false for drop, true for keep. Maybe worth following that logic here too?
Yes, I have now inverted the logic.
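For illustration, a hypothetical sketch of a filter following that convention (true to keep, false to drop), comparing the fields mentioned in the commit messages below; it is not a copy of the PR's code:

// Sketch: keep a shard update event only if something the reconciler cares
// about changed; this matches client-go's filtered event handler convention.
func filterShardEvent(oldObj, newObj interface{}) bool {
	oldShard, ok := oldObj.(*tenancyv1alpha1.ClusterWorkspaceShard)
	if !ok {
		return false
	}
	newShard, ok := newObj.(*tenancyv1alpha1.ClusterWorkspaceShard)
	if !ok {
		return false
	}
	return oldShard.Spec.VirtualWorkspaceURL != newShard.Spec.VirtualWorkspaceURL ||
		!reflect.DeepEqual(oldShard.Labels, newShard.Labels)
}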
Signed-off-by: Frederic Giloux <fgiloux@redhat.com>
…mote server when cache-server-kubeconfig-file is specified Signed-off-by: Frederic Giloux <fgiloux@redhat.com>
This prevents SSA from working with kcp when the cache server is embedded. Signed-off-by: Frederic Giloux <fgiloux@redhat.com>
Signed-off-by: Frederic Giloux <fgiloux@redhat.com>
Signed-off-by: Frederic Giloux <fgiloux@redhat.com>
… enqueue logic but not from the reconcile one. Remove unnecessary getters. Signed-off-by: Frederic Giloux <fgiloux@redhat.com>
Force-pushed from 666d657 to 4a55275
…VirtualWorkspaceURL or labels have changed Signed-off-by: Frederic Giloux <fgiloux@redhat.com>
Force-pushed from 4a55275 to 12d0473
/lgtm
/retest
/hold cancel
/retest
Summary
This PR introduces reconciliation logic to populate one endpoint per shard in the status of APIExportEndpointSlices. This replaces what was previously populated directly in the status of APIExports.
Background
This PR also adds conditions to the APIExportEndpointSlice status.
Filtering will be introduced in a subsequent PR.
Related issue(s)
Contributes to #2332