Cache an async function without calling .await #133

Closed
absoludity opened this issue Oct 24, 2022 · 8 comments

absoludity commented Oct 24, 2022

A question (sorry for any confusion herein): is there a pattern for caching an async fn() without calling .await when the result is already cached?

That is, the normal pattern:

```rust
/// should only sleep the first time it's called
#[cached]
async fn cached_sleep_secs(secs: u64) {
    sleep(Duration::from_secs(secs)).await;
}
```

which is then called with

```rust
cached_sleep_secs(1).await;
```

So even just checking the in-memory cache and returning the cached result - a non-async operation - is handled as an async operation.

Why is this an issue? AFAICT, when used with tokio, the call to .await is correctly handled as an i/o event and so the task is put to sleep, in this case unnecessarily. With lots of requests, this leads to CPU starvation. What I'd like to do, manually, is something like:

```rust
let res = match cache.has(key) {
    true => cache.get(key),
    false => decorated_fn().await,
};
// continue with rest of fn
```

but I'm not sure how I can access the cache global (or maybe I need to use a non-proc macro?).


jaemk commented Oct 24, 2022

Unfortunately if you want to manipulate a cache without async operations while populating it from an async operation like your last example, then you have to do so manually without any macros.

In order for the macros to preserve the function signature of the functions being cached, all of the code (explicit and generated) has to live inside of the async function definition. Additionally, cached async functions will use async synchronization types (async-mutex, async-rw-lock) which will introduce additional wait points, but is necessary for the locking to function cooperatively with the async control flow.

So to your question: if you want to operate on the cache without async locks, you can, but you need to do so by manually defining your cache and wrapping it in a non-async synchronization type (std Mutex/RwLock). You can then write something like your last example above (with the addition of calls to .lock/read/write to synchronize container access). This will work; however, the fact that you are then calling an async function means that this code must exist in an async context, which in turn means that usage of the non-async synchronization types is essentially equivalent to "blocking io" whenever there is lock contention, since waiting for the lock is not a concurrent operation.
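
For illustration, a minimal sketch of that manual approach, assuming a tokio runtime and a plain `std::sync::RwLock<HashMap<..>>` as the cache (the `expensive_lookup` and `cached_lookup` names are made up for the example):

```rust
use std::collections::HashMap;
use std::sync::RwLock;
use std::time::Duration;
use tokio::time::sleep;

// Hypothetical async operation standing in for the real decorated function.
async fn expensive_lookup(key: u64) -> String {
    sleep(Duration::from_secs(key)).await;
    format!("value-{key}")
}

// Check the cache with a plain (non-async) read lock and only fall back to
// the async call on a miss. Neither guard is held across an .await.
async fn cached_lookup(cache: &RwLock<HashMap<u64, String>>, key: u64) -> String {
    // Fast path: read lock only; the guard is dropped when the `if` ends.
    if let Some(hit) = cache.read().unwrap().get(&key) {
        return hit.clone();
    }
    // Slow path: await with no lock held, then take the write lock briefly.
    let value = expensive_lookup(key).await;
    cache.write().unwrap().insert(key, value.clone());
    value
}
```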

And your last question: by default (you can specify otherwise to the macro) the global cache is defined as the function name in all caps. But keep in mind that the global cache needs to be synchronized and will be wrapped in a synchronization type; the synchronization type will be sync or async depending on the async-ness of the function.
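
For example, something along these lines should let you peek at the generated cache for the function above; the exact lock type and how you acquire it depend on the cached version and features, so treat this as a sketch rather than the crate's documented API:

```rust
use std::time::Duration;
use cached::proc_macro::cached;
use cached::Cached;
use tokio::time::sleep;

#[cached]
async fn cached_sleep_secs(secs: u64) {
    sleep(Duration::from_secs(secs)).await;
}

async fn peek() {
    // The macro generates a static named after the function in all caps
    // (CACHED_SLEEP_SECS here). Assumes the async cache is behind an async
    // Mutex; check the generated code if your version wraps it differently.
    let cache = CACHED_SLEEP_SECS.lock().await;
    println!("cached entries: {}", cache.cache_size());
}
```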

@absoludity

OK, thanks James for the thought-out response to my vague question - that helps a lot. Having re-read it a few times, I think what I'm after (whether or not it's possible) is a cache wrapped in std sync for reads but async for writes. Though I'm not sure it'd even fix my issue (rather than instead increasing the blocking io).

Thanks again.

@absoludity

> Additionally, cached async functions will use async synchronization types (async-mutex, async-rw-lock) which will introduce additional wait points, but is necessary for the locking to function cooperatively with the async control flow.

Actually, I have another question about this, if you have time (sorry): it's not clear to me why we'd necessarily need an async rw-lock here, unless we need to call .await while holding the lock - which we don't necessarily. Isn't the write lock only required while writing the value back to the cache?

Unless you mean that because cached must use the same function signature, it is therefore forced to call .await inside the wrapping RwLock? If that's the only reason, then a manual cache with a sync RwLock should work fine, I think? (Re-reading your comment, that's what you'd actually said - so yes, perhaps I'll try that.)


jaemk commented Oct 25, 2022

> then a manual cache with a sync RwLock should work fine

Yes, that's correct, but as you mentioned you need to be careful not to call any async methods while holding the lock, since the executor may "yield" to another task that is trying to acquire the same lock while the lock is still held.
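
A short sketch of the hazard being described, with `some_async_op` as a made-up stand-in for any awaited call:

```rust
use std::sync::RwLock;
use tokio::time::{sleep, Duration};

// Stand-in for any awaited operation (the name is made up for the example).
async fn some_async_op() {
    sleep(Duration::from_millis(10)).await;
}

// What NOT to do: the std guard stays alive across the .await, so any other
// task contending for `cache` blocks its executor thread instead of yielding.
async fn bad(cache: &RwLock<Vec<u64>>) {
    let mut guard = cache.write().unwrap();
    some_async_op().await; // lock is still held at this await point
    guard.push(42);
}
```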

@absoludity

FWIW, it does indeed halve the query time for my 50 requests by using a standard sync read-only lock on the cache. Unfortunately though, because the Cached trait requires a mutable reference for get:

```rust
fn cache_get(&mut self, k: &K) -> Option<&V>;
```

which, AFAICT, is just for the hit/miss counts, you can't call cache_get with a read-only lock, so I'm currently having to use my own hash map.

Let me know if you think that's an issue (that a mutable ref is required for get) and I'll create an issue and try to get you a PR for it. Having a mutable ref there also stops you from using other patterns for the cache, such as promoting new state rather than mutating existing state (which is the way I'm heading in my issue, for simplicity).
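
For reference, this is roughly what that promoting pattern looks like with std types only; the `PromotingCache` name and shape are made up for illustration (this is not the kubeapps code):

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Readers clone an Arc under a very short read lock; writers build a whole
// new map and swap it in, so a get never needs &mut access or a write lock.
struct PromotingCache {
    current: RwLock<Arc<HashMap<String, String>>>,
}

impl PromotingCache {
    fn new() -> Self {
        Self { current: RwLock::new(Arc::new(HashMap::new())) }
    }

    fn get(&self, key: &str) -> Option<String> {
        // &self is enough: no mutation, no write lock.
        let snapshot = Arc::clone(&self.current.read().unwrap());
        snapshot.get(key).cloned()
    }

    fn insert(&self, key: String, value: String) {
        // Take the write lock up front so concurrent inserts can't lose each
        // other's updates, then promote a freshly built map.
        let mut slot = self.current.write().unwrap();
        let mut next = (**slot).clone();
        next.insert(key, value);
        *slot = Arc::new(next);
    }
}
```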


jaemk commented Oct 26, 2022

That's an interesting pattern. I'm not sure we can replace everything wholesale with the promoting version though, since there are some things, like hit/miss counts, certain cache types (lru), and synchronized writes, that rely on mutating the shared value. I would be open to introducing a new macro option though that generates code following this pattern for constrained cases, e.g. any type that could support non-mut gets, with the caveat that hit/miss counters will not function and write-heavy logic could result in very high memory consumption.

@absoludity

Yeah - the promoting new state pattern was just a side-note, in that I'd be able to use the existing ExpiringCredentialCache struct with that pattern in my own code, if cache_get didn't require &mut self. I simply tried to update that type to use &self instead, but it then can no longer implement the Cached trait, I think.


jaemk commented Oct 26, 2022

I think the move here is to create a separate trait, CachedNotMutGet (maybe there's a better name), and implement it for any type that can support it, to indicate the ones that could be used with a promotion pattern.
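
Spelled out, the suggestion amounts to something like the following (a rough sketch only; nothing like this exists in the crate yet, and the name is just the one suggested above):

```rust
/// Like Cached::cache_get, but without requiring &mut self, at the cost of
/// not updating hit/miss counters.
trait CachedNotMutGet<K, V> {
    fn cache_get(&self, k: &K) -> Option<&V>;
}
```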

absoludity added a commit to vmware-tanzu/kubeapps that referenced this issue Nov 18, 2022


This follows on from #5518, this time replacing the `cached` crate with a custom credential cache.

### Description of the change

After further digging, I found that the slow handling of 50 concurrent requests going through the pinniped-proxy had two causes:
- We were caching a function with an async/await signature, which means that even the cached version must have that signature as well - so even a cache hit goes through an await point (which switches the task), and
- the `Cached` trait specifies that even a `cache_get` operation mutates the cache (in our case, just for the hit/miss statistics), which, as a result, requires acquiring a *write* lock on the cache just to read a cached value.

For more details, please see the [discussion with the `Cached`
author](jaemk/cached#133).

To avoid both of those issues, this PR:
1. Adds a `cache` module that provides a generic read/write
`LockableCache` (for multiple readers, single writer) and builds on that
with a `PruningCache` that will prune entries (given a test function)
when they should no longer be cached (see the sketch after this list),
2. Uses (1) to create a single `CredentialCache` on startup (in
`main.rs`) specifically for caching `TokenCredentialRequest` objects and
pruning expired entries, and then passes this through for use in
different threads concurrently.
3. Uses the cache to fetch the credentials.
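
A rough sketch of the `LockableCache`/`PruningCache` shape described in (1), with made-up names and signatures (the real implementation in this PR differs):

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::{Arc, RwLock};

// Generic read/write cache: clone the handle to share it across threads.
#[derive(Clone)]
struct LockableCache<K, V>(Arc<RwLock<HashMap<K, V>>>);

impl<K: Eq + Hash, V: Clone> LockableCache<K, V> {
    fn new() -> Self {
        Self(Arc::new(RwLock::new(HashMap::new())))
    }

    // Multiple readers: a get only needs a read lock.
    fn get(&self, k: &K) -> Option<V> {
        self.0.read().unwrap().get(k).cloned()
    }

    // Single writer: a write lock is taken only when inserting.
    fn insert(&self, k: K, v: V) {
        self.0.write().unwrap().insert(k, v);
    }
}

// Builds on LockableCache and drops entries that a caller-supplied predicate
// says should no longer be cached (e.g. expired credentials).
struct PruningCache<K, V> {
    inner: LockableCache<K, V>,
    keep: fn(&V) -> bool,
}

impl<K: Eq + Hash, V: Clone> PruningCache<K, V> {
    fn new(keep: fn(&V) -> bool) -> Self {
        Self { inner: LockableCache::new(), keep }
    }

    fn get(&self, k: &K) -> Option<V> {
        self.inner.get(k).filter(|v| (self.keep)(v))
    }

    fn insert(&self, k: K, v: V) {
        self.inner.insert(k, v);
    }

    // Prune on demand (rather than on a timer) to keep the sketch simple.
    fn prune(&self) {
        self.inner.0.write().unwrap().retain(|_, v| (self.keep)(v));
    }
}
```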

### Benefits

Fetching from the cache is now non-blocking (generally, except when an entry is being added) and so leads to less task switching, improving the total query time by ~2s (down to 3-4s).

There is still something else using significant CPU when creating the
client itself (cert-related), which I'm investigating now in a separate
PR.

### Possible drawbacks


### Applicable issues


- Ref #5407 

### Additional information


Example log when using `RUST_LOG=info,pinniped_proxy::pinniped=debug`, which shows the cache being used after the first request. I've not included the cache-get timing in the output generally, but it is now always under a millisecond. As above, the significant delays (some calls to prepare_and_call_pinniped_exchange take only 4ms, others 98ms) are what I'll look at next.

```
2022-10-27T01:42:47.820245 [INFO] - Listening on http://0.0.0.0:3333
2022-10-27T01:43:05.077116 [DEBUG] - prepare_and_call_pinniped_exchange took 17ms. Used cache?: false
2022-10-27T01:43:05.085273 [INFO] - GET https://kubernetes.default/api?timeout=32s 200 OK
2022-10-27T01:43:05.091663 [DEBUG] - prepare_and_call_pinniped_exchange took 5ms. Used cache?: true
2022-10-27T01:43:05.100437 [INFO] - GET https://kubernetes.default/apis?timeout=32s 200 OK
2022-10-27T01:43:05.106005 [DEBUG] - prepare_and_call_pinniped_exchange took 4ms. Used cache?: true
2022-10-27T01:43:05.209952 [DEBUG] - prepare_and_call_pinniped_exchange took 21ms. Used cache?: true
2022-10-27T01:43:05.299424 [DEBUG] - prepare_and_call_pinniped_exchange took 5ms. Used cache?: true
2022-10-27T01:43:05.311599 [DEBUG] - prepare_and_call_pinniped_exchange took 5ms. Used cache?: true
2022-10-27T01:43:05.493269 [DEBUG] - prepare_and_call_pinniped_exchange took 98ms. Used cache?: true
2022-10-27T01:43:05.593683 [DEBUG] - prepare_and_call_pinniped_exchange took 4ms. Used cache?: true
2022-10-27T01:43:05.604348 [DEBUG] - prepare_and_call_pinniped_exchange took 4ms. Used cache?: true
2022-10-27T01:43:05.697828 [DEBUG] - prepare_and_call_pinniped_exchange took 87ms. Used cache?: true
2022-10-27T01:43:05.811590 [DEBUG] - prepare_and_call_pinniped_exchange took 20ms. Used cache?: true
2022-10-27T01:43:06.004358 [DEBUG] - prepare_and_call_pinniped_exchange took 94ms. Used cache?: true
2022-10-27T01:43:06.098603 [DEBUG] - prepare_and_call_pinniped_exchange took 5ms. Used cache?: true
2022-10-27T01:43:06.108756 [DEBUG] - prepare_and_call_pinniped_exchange took 4ms. Used cache?: true

```

Signed-off-by: Michael Nelson <minelson@vmware.com>