Improve response times in Kubeapps APIs when using Pinniped-proxy #5407
Comments
Thanks @castelblanque. Yeah, that's intriguing given that we're caching the credential. I'll take a look.
So I started looking at this today, and it seems quite an interesting problem/bug. For what it's worth, I don't think it's related to throttling of requests to the kubernetes-api server from pinniped-proxy, but rather an issue in the way async requests are handled by the proxy itself. Let's see this in detail: I added a few extra logging statements during the proxy operation so we can tell when the request begins, when the id/cert exchange happens, and when the actual downstream request begins (or see this gist for logging with a trace of tokio's mio crate). The first three queries are proxied and returned (the last being /apis), and each takes between 30-40ms:
It is this response which, when returned upstream to `kubeapps-apis`, triggers the ~50 concurrent requests as its REST mapper initialises:
So that began at 43.708, with the "Beginning - Exchanging - Requesting" sequence for each request taking only around 10ms, but no requests complete until all the concurrent incoming requests have been processed. Only at 45.718, a full 2s later, do the requests start being returned by the proxy, finishing at 45.902, so over a period of around 200ms.
I read an in-depth overview of the tokio scheduler. It'll probably be a simple configuration issue, but I'll continue tomorrow.
Interesting findings!
Is that an actual bug in tokio?
I doubt it, more likely just a config option or something I've missed, but I'll hopefully find out tomorrow.
Yeah - all incoming requests get processed before it moves on to return the results for each, though it confuses me that they seem to be done serially. Interesting anyway :)
So, summary of findings after creating a minimal working example:
Using my minimal working example, I can reproduce the results seen when certain conditions are combined explicitly. If only one of these is present, the queries still execute quickly. I'll profile the credential cache access used in pinniped-proxy next.
...and found it: 2 things to note in the logs below:
Finally I can now actually try to fix the issue :)
And the winner is the write lock on the credential cache: it basically forms a queue, with each request sequentially getting the credential before it can make the proxied request. Without the write lock, the requests proceed concurrently.
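To make the write-lock behaviour concrete, here is a minimal std-only sketch (not the actual pinniped-proxy code): a cache whose lookups take only a read lock, so many requests can read the cached credential at once. The demonstration uses a barrier that all reader threads can only pass while simultaneously holding the lock - something that would deadlock if each lookup needed an exclusive write lock, which is exactly the queueing described above.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Barrier, RwLock};
use std::thread;

// A minimal read-optimised cache: lookups take only a read lock,
// so concurrent readers do not queue behind one another.
struct ReadCache<V> {
    map: RwLock<HashMap<String, V>>,
}

impl<V: Clone> ReadCache<V> {
    fn new() -> Self {
        ReadCache { map: RwLock::new(HashMap::new()) }
    }
    fn insert(&self, k: &str, v: V) {
        self.map.write().unwrap().insert(k.to_string(), v);
    }
    fn get(&self, k: &str) -> Option<V> {
        self.map.read().unwrap().get(k).cloned()
    }
}

fn main() {
    let cache = Arc::new(ReadCache::new());
    cache.insert("credential", "token-abc".to_string());

    // All 8 threads hold a read guard at the same time: each reaches the
    // barrier while still holding the lock. If every lookup required a
    // *write* lock instead (as happens when a cache_get needs `&mut self`),
    // only one thread could hold the lock, and the barrier would never
    // be passed - the requests would queue exactly as observed.
    let n = 8;
    let barrier = Arc::new(Barrier::new(n));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let c = Arc::clone(&cache);
            let b = Arc::clone(&barrier);
            thread::spawn(move || {
                let guard = c.map.read().unwrap();
                b.wait(); // passable only if all readers hold the lock concurrently
                guard.get("credential").cloned()
            })
        })
        .collect();

    for h in handles {
        assert_eq!(h.join().unwrap(), Some("token-abc".to_string()));
    }
    println!("8 concurrent readers held the lock simultaneously");
}
```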
Awesome work Michael. So the write-lock was blocking all subsequent requests. This is something that still amazes me about programming: such a simple fix, but so hard to pinpoint 😄
Awesome finding Michael!
Thanks - it was a lot of fun to track down (good chance to learn quite a bit). I just had to share here because I'm working all on my own in my TZ with no-one to high-five with :) |
Just an update on this: as mentioned on the linked PR, the issue that I found last week (mentioned above) fixes one aspect, but the requests still take way too long. I've spent the past couple of days digging to find the cause.
I didn't have much success finding out how the requests could still be so slow until this morning, when I realised that because the cached function keeps its async signature, even a cache hit involves a blocking call that switches the task. At least, this is what I will check next, though I need to take a break from this to get some other things done (hence dumping thoughts here). The approach will be, rather than decorating the async function with the `cached` macro, to cache the credential itself.
Signed-off-by: Michael Nelson <minelson@vmware.com>

### Description of the change
This PR is the result of the investigation described on #5407, the end result of which is a single-line removal, though I've left some other changes that were helpful during debugging (logging related).

### Benefits
Pinniped-proxy does not stack all requests behind a write lock when called concurrently.

### Possible drawbacks
None.

### Applicable issues
- ref #5407

### Additional information
It does not solve the slow requests when running in a local kind cluster, as each task is sleeping while accessing the cache, leading to anywhere between 4 and 100ms just to get the cached response.
Signed-off-by: Michael Nelson <minelson@vmware.com>

Follows on from #5518, this time replacing the `cached` package with a custom credential cache.

### Description of the change
After further digging, I found that one cause of the slow handling of 50 concurrent requests going through the pinniped-proxy was that:

- We were caching a function with an async/await signature, which means that even the cached version must have that signature as well - which means a blocking i/o call (which switches the task), and
- The `Cached` trait specifies that even a `cache_get` operation mutates the cache (in our case, just for statistics of hits/misses), which, as a result, requires acquiring a *write* lock to the cache to read a cached value.

For more details, please see the [discussion with the `Cached` author](jaemk/cached#133).

To avoid both of those issues, this PR:

1. Adds a `cache` module that provides a generic read/write `LockableCache` (for multiple readers, single writer) and builds on that with a `PruningCache` that will prune entries (given a test function) when they should no longer be cached,
2. Uses (1) to create a single `CredentialCache` on startup (in `main.rs`) specifically for caching `TokenCredentialRequest` objects and pruning expired entries, and then passes this through for use in different threads concurrently.
3. Uses the cache to fetch the credentials.

### Benefits
Fetching from the cache is now non-blocking (generally, except when an entry is being added) and so leads to less task switching, improving the total query time by ~2s (down to 3-4s).

There is still something else using significant CPU when creating the client itself (cert-related), which I'm investigating now in a separate PR.

### Possible drawbacks

### Applicable issues
- Ref #5407

### Additional information
Example log when using `RUST_LOG=info,pinniped_proxy::pinniped=debug`, which shows the cache being used after the first request. I've not included it in the output generally, but the cache get is now always under a millisecond. As above, the significant delays (some calls to prepare_and_call_pinniped_exchange only 4ms, others 98ms) are what I'll look at next.

```
2022-10-27T01:42:47.820245 [INFO] - Listening on http://0.0.0.0:3333
2022-10-27T01:43:05.077116 [DEBUG] - prepare_and_call_pinniped_exchange took 17ms. Used cache?: false
2022-10-27T01:43:05.085273 [INFO] - GET https://kubernetes.default/api?timeout=32s 200 OK
2022-10-27T01:43:05.091663 [DEBUG] - prepare_and_call_pinniped_exchange took 5ms. Used cache?: true
2022-10-27T01:43:05.100437 [INFO] - GET https://kubernetes.default/apis?timeout=32s 200 OK
2022-10-27T01:43:05.106005 [DEBUG] - prepare_and_call_pinniped_exchange took 4ms. Used cache?: true
2022-10-27T01:43:05.209952 [DEBUG] - prepare_and_call_pinniped_exchange took 21ms. Used cache?: true
2022-10-27T01:43:05.299424 [DEBUG] - prepare_and_call_pinniped_exchange took 5ms. Used cache?: true
2022-10-27T01:43:05.311599 [DEBUG] - prepare_and_call_pinniped_exchange took 5ms. Used cache?: true
2022-10-27T01:43:05.493269 [DEBUG] - prepare_and_call_pinniped_exchange took 98ms. Used cache?: true
2022-10-27T01:43:05.593683 [DEBUG] - prepare_and_call_pinniped_exchange took 4ms. Used cache?: true
2022-10-27T01:43:05.604348 [DEBUG] - prepare_and_call_pinniped_exchange took 4ms. Used cache?: true
2022-10-27T01:43:05.697828 [DEBUG] - prepare_and_call_pinniped_exchange took 87ms. Used cache?: true
2022-10-27T01:43:05.811590 [DEBUG] - prepare_and_call_pinniped_exchange took 20ms. Used cache?: true
2022-10-27T01:43:06.004358 [DEBUG] - prepare_and_call_pinniped_exchange took 94ms. Used cache?: true
2022-10-27T01:43:06.098603 [DEBUG] - prepare_and_call_pinniped_exchange took 5ms. Used cache?: true
2022-10-27T01:43:06.108756 [DEBUG] - prepare_and_call_pinniped_exchange took 4ms. Used cache?: true
```
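The two-layer design named in the PR (`LockableCache` and `PruningCache`) can be sketched as follows. The type names come from the PR description; the implementation details below are illustrative assumptions only, using a std `RwLock` so that `get` needs only a read lock (multiple readers, single writer), and the hypothetical credential type `(token, expiry)` stands in for `TokenCredentialRequest`:

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::RwLock;

// Layer 1: multiple readers, single writer - gets take only a read lock.
struct LockableCache<K, V> {
    map: RwLock<HashMap<K, V>>,
}

impl<K: Eq + Hash, V: Clone> LockableCache<K, V> {
    fn new() -> Self {
        LockableCache { map: RwLock::new(HashMap::new()) }
    }
    fn get(&self, k: &K) -> Option<V> {
        self.map.read().unwrap().get(k).cloned()
    }
    fn insert(&self, k: K, v: V) {
        self.map.write().unwrap().insert(k, v);
    }
}

// Layer 2: builds on LockableCache, pruning entries that a test
// function says should no longer be cached (e.g. expired credentials).
struct PruningCache<K, V> {
    inner: LockableCache<K, V>,
    still_valid: fn(&V) -> bool,
}

impl<K: Eq + Hash, V: Clone> PruningCache<K, V> {
    fn new(still_valid: fn(&V) -> bool) -> Self {
        PruningCache { inner: LockableCache::new(), still_valid }
    }
    fn insert(&self, k: K, v: V) {
        self.inner.insert(k, v);
    }
    fn get(&self, k: &K) -> Option<V> {
        // Never hand back an entry that has already expired.
        self.inner.get(k).filter(|v| (self.still_valid)(v))
    }
    fn prune(&self) {
        let valid = self.still_valid;
        self.inner.map.write().unwrap().retain(|_, v| valid(v));
    }
}

fn main() {
    // Hypothetical credential: (token, expiry in epoch seconds).
    let cache: PruningCache<String, (String, u64)> =
        PruningCache::new(|&(_, expiry)| expiry > 100);
    cache.insert("cluster-a".into(), ("tok1".into(), 200)); // still valid
    cache.insert("cluster-b".into(), ("tok2".into(), 50)); // expired
    cache.prune();
    assert!(cache.get(&"cluster-a".to_string()).is_some());
    assert!(cache.get(&"cluster-b".to_string()).is_none());
    println!("expired entries pruned");
}
```

Created once on startup and shared across tasks (wrapped in an `Arc` in practice), this gives the non-blocking reads described above: only inserts and prunes take the write lock.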
Signed-off-by: Michael Nelson <minelson@vmware.com>

Reduces the proxying of the ~50 API calls (when the go-lang kubeapps-apis initialises its REST mapper) from 4-6s down to a consistent 1.3-1.6s (on my single computer running a cluster including pinniped-proxy).

### Description of the change
After more profiling, the remaining offender was the call to create a `kube::Client`, which ranged from 4-100ms (of CPU + interrupts - not sure if there was any I/O) for every single request. So when 50 requests are issued concurrently, it was starving the CPU available (on my single laptop running a cluster etc.).

~~The call to create the `kube::Client` is dependent on the target k8s api server only, so it's easy to cache as the number of different results is equal to the number of clusters targeted by Kubeapps.~~ EDIT: Actually, I just realised we may not even need to cache it, since we can simply move where it's evaluated so we only evaluate it when the token was not cached. I'll try that before marking ready.

### Benefits
Pinniped-proxy handles the 50 concurrent request/responses in a more reasonable time.

### Possible drawbacks

### Applicable issues
- fixes #5407 (finally)

### Additional information
Without this change, the main cause of CPU was the call to `get_client_config`, which then delays all the proxy requests starting:

```
2022-10-28T01:53:54.191216 [TRACE] - prepare_and_call_pinniped_exchange took 4ms. Used cache?: true. 4ms for get_client_config, 0ms for creating empty cred data, 0ms to fetch the cred.
2022-10-28T01:53:54.291704 [TRACE] - prepare_and_call_pinniped_exchange took 22ms. Used cache?: true. 22ms for get_client_config, 0ms for creating empty cred data, 0ms to fetch the cred.
2022-10-28T01:53:54.470193 [TRACE] - prepare_and_call_pinniped_exchange took 84ms. Used cache?: true. 84ms for get_client_config, 0ms for creating empty cred data, 0ms to fetch the cred.
2022-10-28T01:53:54.585938 [TRACE] - prepare_and_call_pinniped_exchange took 14ms. Used cache?: true. 14ms for get_client_config, 0ms for creating empty cred data, 0ms to fetch the cred.
2022-10-28T01:53:54.674279 [TRACE] - prepare_and_call_pinniped_exchange took 4ms. Used cache?: true. 4ms for get_client_config, 0ms for creating empty cred data, 0ms to fetch the cred.
2022-10-28T01:53:54.683590 [TRACE] - prepare_and_call_pinniped_exchange took 4ms. Used cache?: true. 4ms for get_client_config, 0ms for creating empty cred data, 0ms to fetch the cred.
2022-10-28T01:53:54.769326 [TRACE] - prepare_and_call_pinniped_exchange took 80ms. Used cache?: true. 80ms for get_client_config, 0ms for creating empty cred data, 0ms to fetch the cred.
```

After the change the call is effectively zero:

```
2022-10-28T05:28:18.635146 [TRACE] - prepare_and_call_pinniped_exchange took 0ms. Used cache?: true. 0ms for get_client_config, 0ms for creating empty cred data, 0ms to fetch the cred.
2022-10-28T05:28:18.635306 [TRACE] - prepare_and_call_pinniped_exchange took 0ms. Used cache?: true. 0ms for get_client_config, 0ms for creating empty cred data, 0ms to fetch the cred.
2022-10-28T05:28:18.688264 [TRACE] - prepare_and_call_pinniped_exchange took 0ms. Used cache?: true. 0ms for get_client_config, 0ms for creating empty cred data, 0ms to fetch the cred.
2022-10-28T05:28:18.688497 [TRACE] - prepare_and_call_pinniped_exchange took 0ms. Used cache?: true. 0ms for get_client_config, 0ms for creating empty cred data, 0ms to fetch the cred.
2022-10-28T05:28:18.692989 [TRACE] - prepare_and_call_pinniped_exchange took 0ms. Used cache?: true. 0ms for get_client_config, 0ms for creating empty cred data, 0ms to fetch the cred.
```
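The "move where it's evaluated" fix above can be sketched as follows. This is an illustrative, std-only model, not the pinniped-proxy source: `build_client_config` and `exchange_token_for_credential` are hypothetical stand-ins for the expensive `kube::Client`/config construction and the pinniped credential exchange. The point is purely the control flow: the expensive construction happens only on a cache miss, rather than unconditionally on every request.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::RwLock;

// Counts how often the expensive setup actually runs.
static CLIENT_CONFIGS_BUILT: AtomicUsize = AtomicUsize::new(0);

// Stand-in for the expensive client/config construction
// (4-100ms of CPU per call in the logs above).
fn build_client_config(api_server: &str) -> String {
    CLIENT_CONFIGS_BUILT.fetch_add(1, Ordering::SeqCst);
    format!("client-config-for-{}", api_server)
}

// Stand-in for the credential exchange, which needs the client.
fn exchange_token_for_credential(_client: &str, token: &str) -> String {
    format!("credential-for-{}", token)
}

// Look up the cached credential first; build the client config
// only on a cache miss. Before the change, the client config was
// built unconditionally on every request.
fn get_credential(
    cache: &RwLock<HashMap<String, String>>,
    api_server: &str,
    token: &str,
) -> String {
    if let Some(cred) = cache.read().unwrap().get(token).cloned() {
        return cred; // cache hit: no client config needed at all
    }
    let client = build_client_config(api_server); // only on a miss
    let cred = exchange_token_for_credential(&client, token);
    cache.write().unwrap().insert(token.to_string(), cred.clone());
    cred
}

fn main() {
    let cache = RwLock::new(HashMap::new());
    for _ in 0..50 {
        get_credential(&cache, "https://kubernetes.default", "user-token");
    }
    // 50 requests, but the expensive construction ran exactly once.
    assert_eq!(CLIENT_CONFIGS_BUILT.load(Ordering::SeqCst), 1);
    println!("client config built once for 50 requests");
}
```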
### Summary
Some requests (e.g. `core.packages.v1alpha1.PackagesService.GetAvailablePackageSummaries`) are slow when using `pinniped-proxy`. Investigate and improve those times.

### Description
After work done for #5362 and #4919, response times have improved due to no longer creating multiple unused K8s Go clients.
However, there still seems to be a throttling problem when using `pinniped-proxy`. Multiple requests are made from `kubeapps-apis` to the K8s API (with correct QPS and Burst values), but the requests go through `pinniped-proxy`.

Log from the proxy:
When `kubeapps-apis` is configured without `pinniped-proxy`, response times for those requests are much faster. This might indicate that there is a throttling problem from `pinniped-proxy` towards the actual K8s API server.

### Acceptance criteria
- Response times are improved for requests going through `pinniped-proxy`, to similar times obtained without `pinniped-proxy`.