Cache an async function without calling .await #133

Closed
absoludity opened this issue Oct 24, 2022 · 8 comments

absoludity commented Oct 24, 2022

A question (sorry for any confusion herein): is there a pattern for caching an async fn() without calling .await when the result is already cached?

That is, the normal pattern:

```rust
/// should only sleep the first time it's called
#[cached]
async fn cached_sleep_secs(secs: u64) {
    sleep(Duration::from_secs(secs)).await;
}
```

which is then called with

```rust
cached_sleep_secs(1).await;
```

So even just checking the in-memory cache and returning the cached result - a non-async operation - is handled as an async operation.

Why is this an issue? AFAICT, when used with tokio, the call to .await is correctly handled as an i/o event and so the task is put to sleep, in this case unnecessarily. With lots of requests, this leads to CPU starvation. What I'd like to do, manually, is something like:

```rust
let res = match cache.has(key) {
    true => cache.get(key),
    false => decorated_fn().await,
};
// continue with rest of fn
```

but I'm not sure how I can access the cache global (or maybe I need to use a non-proc macro?).


jaemk commented Oct 24, 2022

Unfortunately if you want to manipulate a cache without async operations while populating it from an async operation like your last example, then you have to do so manually without any macros.

In order for the macros to preserve the function signature of the functions being cached, all of the code (explicit and generated) has to live inside of the async function definition. Additionally, cached async functions will use async synchronization types (async-mutex, async-rw-lock) which will introduce additional wait points, but is necessary for the locking to function cooperatively with the async control flow.

So to your question: if you want to operate on the cache without async locks, you can, but you need to do so by manually defining your cache and wrapping it in a non-async synchronization type (std Mutex/RwLock). You can then write something like your last example above (with the addition of calls to .lock/read/write to synchronize container access). This will work; however, the fact that you are then calling an async function means that this code must exist in an async context, which in turn means that usage of the non-async synchronization types is essentially equivalent to "blocking io" whenever there is lock contention, since waiting for the lock is not a concurrent operation.
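
For illustration, a minimal sketch of that manual approach, assuming a tokio runtime and a plain `std::sync::RwLock<HashMap<..>>` as the cache (the `expensive_lookup` and `cached_lookup` names are made up for the example):

```rust
use std::collections::HashMap;
use std::sync::RwLock;
use std::time::Duration;
use tokio::time::sleep;

// Hypothetical async operation standing in for the real decorated function.
async fn expensive_lookup(key: u64) -> String {
    sleep(Duration::from_secs(key)).await;
    format!("value-{key}")
}

// Check the cache with a plain (non-async) read lock and only fall back to
// the async call on a miss. Neither guard is held across an .await.
async fn cached_lookup(cache: &RwLock<HashMap<u64, String>>, key: u64) -> String {
    // Fast path: read lock only; the guard is dropped when the `if` ends.
    if let Some(hit) = cache.read().unwrap().get(&key) {
        return hit.clone();
    }
    // Slow path: await with no lock held, then take the write lock briefly.
    let value = expensive_lookup(key).await;
    cache.write().unwrap().insert(key, value.clone());
    value
}
```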

And your last question: by default (you can specify otherwise to the macro) the global cache is defined as the function name in all caps. But keep in mind that the global cache needs to be synchronized and will be wrapped in a synchronization type; the synchronization type will be sync or async depending on the async-ness of the function.
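
For example, something along these lines should let you peek at the generated cache for the function above; the exact lock type and how you acquire it depend on the cached version and features, so treat this as a sketch rather than the crate's documented API:

```rust
use std::time::Duration;
use cached::proc_macro::cached;
use cached::Cached;
use tokio::time::sleep;

#[cached]
async fn cached_sleep_secs(secs: u64) {
    sleep(Duration::from_secs(secs)).await;
}

async fn peek() {
    // The macro generates a static named after the function in all caps
    // (CACHED_SLEEP_SECS here). Assumes the async cache is behind an async
    // Mutex; check the generated code if your version wraps it differently.
    let cache = CACHED_SLEEP_SECS.lock().await;
    println!("cached entries: {}", cache.cache_size());
}
```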

@absoludity

OK, thanks James for the thought-out response to my vague question - that helps a lot. Having re-read it a few times, I think what I'm after (whether or not it's possible) is a cache wrapped in std sync for reads but async for writes. Though I'm not sure it'd even fix my issue (rather than instead increasing the blocking io).

Thanks again.

@absoludity

> Additionally, cached async functions will use async synchronization types (async-mutex, async-rw-lock) which will introduce additional wait points, but is necessary for the locking to function cooperatively with the async control flow.

Actually, I have another question about this, if you have time (sorry): it's not clear to me why we'd necessarily need an async rw-lock here, unless we need to call .await while holding the lock - which we don't necessarily. Isn't the write lock only required while writing the value back to the cache?

Unless you mean that because cached must use the same function signature, it is therefore forced to call .await inside the wrapping RwLock? If that's the only reason, then a manual cache with a sync RwLock should work fine, I think? (Re-reading your comment, that's what you'd actually said - so yes, perhaps I'll try that.)


jaemk commented Oct 25, 2022

> then a manual cache with a sync RwLock should work fine

Yes, that's correct, but as you mentioned you need to be careful not to call any async methods while holding the lock, since the executor may "yield" to another task that is trying to acquire the same lock while the lock is still held.
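
A short sketch of the hazard being described, with `some_async_op` as a made-up stand-in for any awaited call:

```rust
use std::sync::RwLock;
use tokio::time::{sleep, Duration};

// Stand-in for any awaited operation (the name is made up for the example).
async fn some_async_op() {
    sleep(Duration::from_millis(10)).await;
}

// What NOT to do: the std guard stays alive across the .await, so any other
// task contending for `cache` blocks its executor thread instead of yielding.
async fn bad(cache: &RwLock<Vec<u64>>) {
    let mut guard = cache.write().unwrap();
    some_async_op().await; // lock is still held at this await point
    guard.push(42);
}
```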

@absoludity

FWIW, it does indeed halve the query time for my 50 requests by using a standard sync read-only lock on the cache. Unfortunately though, because the Cached trait requires a mutable reference for get:

```rust
fn cache_get(&mut self, k: &K) -> Option<&V>;
```

which, AFAICT, is just for the hit/miss counts, you can't call cache_get with a read-only lock, so I'm currently having to use my own hash map.

Let me know if you think that's an issue (that a mutable ref is required for get) and I'll create an issue and try to get you a PR for it. Having a mutable ref there also stops you from using other patterns for the cache, such as promoting new state rather than mutating existing state (which is the way I'm heading in my issue, for simplicity).
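
For reference, this is roughly what that promoting pattern looks like with std types only; the `PromotingCache` name and shape are made up for illustration (this is not the kubeapps code):

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Readers clone an Arc under a very short read lock; writers build a whole
// new map and swap it in, so a get never needs &mut access or a write lock.
struct PromotingCache {
    current: RwLock<Arc<HashMap<String, String>>>,
}

impl PromotingCache {
    fn new() -> Self {
        Self { current: RwLock::new(Arc::new(HashMap::new())) }
    }

    fn get(&self, key: &str) -> Option<String> {
        // &self is enough: no mutation, no write lock.
        let snapshot = Arc::clone(&self.current.read().unwrap());
        snapshot.get(key).cloned()
    }

    fn insert(&self, key: String, value: String) {
        // Take the write lock up front so concurrent inserts can't lose each
        // other's updates, then promote a freshly built map.
        let mut slot = self.current.write().unwrap();
        let mut next = (**slot).clone();
        next.insert(key, value);
        *slot = Arc::new(next);
    }
}
```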


jaemk commented Oct 26, 2022

That's an interesting pattern. I'm not sure we can replace everything wholesale with the promoting version though, since there are some things, like hit/miss counts, certain cache types (lru), and synchronized writes, that rely on mutating the shared value. I would be open to introducing a new macro option though that generates code following this pattern for constrained cases, e.g. any type that could support non-mut gets, with the caveat that hit/miss counters will not function and write-heavy logic could result in very high memory consumption.

@absoludity

Yeah - the promoting new state pattern was just a side-note, in that I'd be able to use the existing ExpiringCredentialCache struct with that pattern in my own code, if cache_get didn't require &mut self. I simply tried to update that type to use &self instead, but it then can no longer implement the Cached trait, I think.


jaemk commented Oct 26, 2022

I think the move here is to create a separate trait, CachedNotMutGet (maybe there's a better name), and implement it for any type that can support it, to indicate the ones that could be used with a promotion pattern.
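
Spelled out, the suggestion amounts to something like the following (a rough sketch only; nothing like this exists in the crate yet, and the name is just the one suggested above):

```rust
/// Like Cached::cache_get, but without requiring &mut self, at the cost of
/// not updating hit/miss counters.
trait CachedNotMutGet<K, V> {
    fn cache_get(&self, k: &K) -> Option<&V>;
}
```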

absoludity added a commit to vmware-tanzu/kubeapps that referenced this issue Nov 18, 2022


This follows on from #5518, this time replacing the `cached` crate with a custom credential cache.

### Description of the change

After further digging, I found that the slow handling of 50 concurrent requests going through the pinniped-proxy had two causes:
- We were caching a function with an async/await signature, which means that even the cached version must have that signature as well - so even a cache hit goes through an await point (which switches the task), and
- the `Cached` trait specifies that even a `cache_get` operation mutates the cache (in our case, just for the hit/miss statistics), which, as a result, requires acquiring a *write* lock on the cache just to read a cached value.

For more details, please see the [discussion with the `Cached`
author](jaemk/cached#133).

To avoid both of those issues, this PR:
1. Adds a `cache` module that provides a generic read/write
`LockableCache` (for multiple readers, single writer) and builds on that
with a `PruningCache` that will prune entries (given a test function)
when they should no longer be cached (see the sketch after this list),
2. Uses (1) to create a single `CredentialCache` on startup (in
`main.rs`) specifically for caching `TokenCredentialRequest` objects and
pruning expired entries, and then passes this through for use in
different threads concurrently.
3. Uses the cache to fetch the credentials.
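
A rough sketch of the `LockableCache`/`PruningCache` shape described in (1), with made-up names and signatures (the real implementation in this PR differs):

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::{Arc, RwLock};

// Generic read/write cache: clone the handle to share it across threads.
#[derive(Clone)]
struct LockableCache<K, V>(Arc<RwLock<HashMap<K, V>>>);

impl<K: Eq + Hash, V: Clone> LockableCache<K, V> {
    fn new() -> Self {
        Self(Arc::new(RwLock::new(HashMap::new())))
    }

    // Multiple readers: a get only needs a read lock.
    fn get(&self, k: &K) -> Option<V> {
        self.0.read().unwrap().get(k).cloned()
    }

    // Single writer: a write lock is taken only when inserting.
    fn insert(&self, k: K, v: V) {
        self.0.write().unwrap().insert(k, v);
    }
}

// Builds on LockableCache and drops entries that a caller-supplied predicate
// says should no longer be cached (e.g. expired credentials).
struct PruningCache<K, V> {
    inner: LockableCache<K, V>,
    keep: fn(&V) -> bool,
}

impl<K: Eq + Hash, V: Clone> PruningCache<K, V> {
    fn new(keep: fn(&V) -> bool) -> Self {
        Self { inner: LockableCache::new(), keep }
    }

    fn get(&self, k: &K) -> Option<V> {
        self.inner.get(k).filter(|v| (self.keep)(v))
    }

    fn insert(&self, k: K, v: V) {
        self.inner.insert(k, v);
    }

    // Prune on demand (rather than on a timer) to keep the sketch simple.
    fn prune(&self) {
        self.inner.0.write().unwrap().retain(|_, v| (self.keep)(v));
    }
}
```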

### Benefits

Fetching from the cache is now non-blocking (generally, except when an entry is being added) and so leads to less task switching, improving the total query time by ~2s (down to 3-4s).

There is still something else using significant CPU when creating the
client itself (cert-related), which I'm investigating now in a separate
PR.

### Possible drawbacks


### Applicable issues


- Ref #5407 

### Additional information


Example log when using `RUST_LOG=info,pinniped_proxy::pinniped=debug`, which shows the cache being used after the first request. I've not included the cache-get timing in the output generally, but it is now always under a millisecond. As above, the significant delays (some calls to prepare_and_call_pinniped_exchange take only 4ms, others 98ms) are what I'll look at next.

```
2022-10-27T01:42:47.820245 [INFO] - Listening on http://0.0.0.0:3333
2022-10-27T01:43:05.077116 [DEBUG] - prepare_and_call_pinniped_exchange took 17ms. Used cache?: false
2022-10-27T01:43:05.085273 [INFO] - GET https://kubernetes.default/api?timeout=32s 200 OK
2022-10-27T01:43:05.091663 [DEBUG] - prepare_and_call_pinniped_exchange took 5ms. Used cache?: true
2022-10-27T01:43:05.100437 [INFO] - GET https://kubernetes.default/apis?timeout=32s 200 OK
2022-10-27T01:43:05.106005 [DEBUG] - prepare_and_call_pinniped_exchange took 4ms. Used cache?: true
2022-10-27T01:43:05.209952 [DEBUG] - prepare_and_call_pinniped_exchange took 21ms. Used cache?: true
2022-10-27T01:43:05.299424 [DEBUG] - prepare_and_call_pinniped_exchange took 5ms. Used cache?: true
2022-10-27T01:43:05.311599 [DEBUG] - prepare_and_call_pinniped_exchange took 5ms. Used cache?: true
2022-10-27T01:43:05.493269 [DEBUG] - prepare_and_call_pinniped_exchange took 98ms. Used cache?: true
2022-10-27T01:43:05.593683 [DEBUG] - prepare_and_call_pinniped_exchange took 4ms. Used cache?: true
2022-10-27T01:43:05.604348 [DEBUG] - prepare_and_call_pinniped_exchange took 4ms. Used cache?: true
2022-10-27T01:43:05.697828 [DEBUG] - prepare_and_call_pinniped_exchange took 87ms. Used cache?: true
2022-10-27T01:43:05.811590 [DEBUG] - prepare_and_call_pinniped_exchange took 20ms. Used cache?: true
2022-10-27T01:43:06.004358 [DEBUG] - prepare_and_call_pinniped_exchange took 94ms. Used cache?: true
2022-10-27T01:43:06.098603 [DEBUG] - prepare_and_call_pinniped_exchange took 5ms. Used cache?: true
2022-10-27T01:43:06.108756 [DEBUG] - prepare_and_call_pinniped_exchange took 4ms. Used cache?: true

```

Signed-off-by: Michael Nelson <minelson@vmware.com>