Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not cancel server fetch of watcher on dependent key update #1156

Conversation

flobories
Copy link
Contributor

This is an attempt at solving #1155
By only canceling fetches when they are replaced by a server fetch, we are making sure that cache updates do not cancel any inflight fetches.

@apollo-cla
Copy link

@flobories: Thank you for submitting a pull request! Before we can merge it, you'll need to sign the Apollo Contributor License Agreement here: https://contribute.apollographql.com/

Copy link
Contributor

@designatednerd designatednerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woof. Thanks for the detailed report and the PR. One minor comment.

Sources/Apollo/GraphQLQueryWatcher.swift Outdated Show resolved Hide resolved
@RolandasRazma
Copy link
Contributor

RolandasRazma commented Apr 15, 2020

hmm, not sure if this is correct solution. Wouldn't it be better to change signature to fetch(cachePolicy: CachePolicy, cancelInFlight: Bool = true) and do fetch(cachePolicy: .returnCacheDataElseFetch, cancelInFlight: false) from cache updates?

@RolandasRazma
Copy link
Contributor

also this introduces bit uncertainty in GraphQLQueryWatcher.cancel() - what are we canceling?

@RolandasRazma
Copy link
Contributor

thinking about both cases above I think better way is to figure out solution where canceling fetch won't call resultHandler when it's done from cache updates, or maybe not call resultHandler at all for cancelations? - what are we gaining there?.

@designatednerd
Copy link
Contributor

@RolandasRazma I don't know how we'd infer the correct fetch policy when cancelling from the cache though. What if it's .returnCacheDataAndFetch rather than elseFetch?

Note that this also just calls cancel straight on the cancellable that's tied to the network request - it doesn't have anything to do with the cancel on the watcher itself.

Does that make more sense?

@RolandasRazma
Copy link
Contributor

the problem is that
watcher = GraphQLQueryWatcher()
watcher.fetch()
watcher.fetch()
watcher.cancel()

will have one success and one cancelation callback, and that might lead to very confusing results unless you know whole system

@designatednerd
Copy link
Contributor

Wait, I don't think I understand why that would happen. Can you explain:

a) Why this would return one and one?
b) What you expect would happen instead?

@RolandasRazma
Copy link
Contributor

RolandasRazma commented Apr 15, 2020

a)
watcher = GraphQLQueryWatcher(...) { result: Result<GraphQLResult, Error>
print(result)
}
watcher.fetch() // Fetch A
watcher.fetch() // Fetch B
watcher.cancel()

Fetch B is canceled and return error as "canceled" (ref to fetching B)
Fetch A will succeed after some time (if this PR accepted) and will return data (we lost ref to fetching A)

b)
that's a good question "what is expected behaviour?". Reading code above my first thought would be "oh, so both fetches will go in parallel and "cancel" will cancel them both at the same time" (as there no way to specify witch one to cancel)
second thought would be behaviour we have now, where second fetch "replaces" first one

@designatednerd
Copy link
Contributor

OK, thank you very much for clarifying. @flobories What's your take on @RolandasRazma's concerns?

@flobories
Copy link
Contributor Author

flobories commented Apr 15, 2020

I see you concern @RolandasRazma about having two different fetches in parallel, only one being cancelable because only one fetching reference.

The main concern I'm trying to address is a server fetch being canceled by a cache update, when I'd prefer the server fetch to reach completion uninterrupted as I did not explicitly cancel it.
Do you see a way around this problem without introducing parallel fetches in the watcher?

  • If we are ok with 2 concurrencies, we could introduce the concept of distinct storeUpdateFetching and refetching Cancelables instead of a single fetching, and have store updates only cancel the storeUpdateFetching, similary refetches would only cancel the refetching.
    The public cancel() would cancel both. This could still be tricky because it would return two canceled errors I believe?
  • If we want to keep only 1, then I believe we need to prioritize cache updates or server fetches either by dropping some or by chaining them. I feel this may be harder to nail? As a user of the library, I would expect cache updates to be "instant", so if a cache update happens while a server fetch is in flight we'd need to
    1. Cancel the server fetch
    2. Apply the cache fetch
    3. Refetch from server once cache update is done
      This would overall take more time? And lead to some cancelation errors?

Does this make sense?

@RolandasRazma
Copy link
Contributor

your last scenario is what I was describing, where canceling from cache hides server fetch cancelation and "user of the library" never even knows that his request was canceled and restarted.

Implementation as it stands in this PR will lead to bugs where:
You have watcher on object
You start "fetch A" for that object
You start mutation on server and fetching same object, update it in cache (from mutation)
If your "fetch A" returns later than mutation, your object in cache will be reverted to "incorrect" one (as your fetch started before mutation). It will go "old object" -> "new object" (mutation) -> "old object" ("fetch A" returns)

this requires weird internet condition, but this kind of race conditions impossible to find when you need to fix

@designatednerd
Copy link
Contributor

AHA, I think I found a big piece of my confusion, which is partly because I'm noodling with some stuff with a delegate-based URL session: The callback-based APIs will always puke up an error if something is cancelled, but the way I've got it set up in my delegate-based one I'm killing off completion blocks before the cancellation error hits the delegate.

I feel like maybe we should be performing the fetch directly on the store that comes in, and only hitting that fetch that creates a cancellable if there's actually no results in the cache. Then we could just straight up cancel if something calls fetch again.

What do you guys think about that?

@designatednerd
Copy link
Contributor

(FYI, I'm converting this to a draft so I don't brain-dead merge it tomorrow morning before we finish this discussion)

@designatednerd designatednerd marked this pull request as draft April 16, 2020 02:27
@designatednerd
Copy link
Contributor

@RolandasRazma @flobories Thoughts?

@RolandasRazma
Copy link
Contributor

sorry I'm not that familiar with codebase to give credible opinion

@flobories
Copy link
Contributor Author

I like your solution @designatednerd! Store fetches and main fetches/refetches would be siloed!

Two comments/questions:

  • We would not create/track a cancelable from the store store.load(query:) in store(_store: didChangeKeys: context: ), right? Could there still be a tight race condition where watcher.cancel() is called but the store load is in progress and not canceled? I believe that's what @RolandasRazma is concerned about.

and only hitting that fetch that creates a cancellable if there's actually no results in the cache

  • When would this happen? If a cache dependent key is hit before an initial fetch of the watcher's query? Would that initial fetch not be either in progress or at the client developer's discretion? Or do you feel a refetch is mandated because the initial fetch would be invalidated in flight by the cache dependent key being modified?

Your solution would definitely solve my issue.
I can take a stab at a POC implementation later today if you'd like, unless you feel it's too tricky for a developer less familiar with the codebase.

@designatednerd
Copy link
Contributor

@RolandasRazma You know this part of the codebase at least as well as I do, i've been focused on other stuff 🙃

You can't actually cancel a store load, so there's nothing really to track.

If the entire watcher is cancelled while the store load is happening, then the completion callback should be nilled out in the cancellation and then we wouldn't call it when the store load finishes.

I think you'd only have the cache key create a new fetch if you get that dependent keys changed notification after something got cleared out - then the cache would have nothing and it'd force a re-fetch.

Like I've said repeatedly, this is extremely complex, to the point that I've been working on this lib for 10 months and still really haven't totally wrapped my head around every possibility here. @flobories the fact that you were able to even identify this problem indicates you're probably in an ok spot to take a stab at it with feedback from me and @RolandasRazma. So yes, POC is welcome!

@flobories
Copy link
Contributor Author

Will do, thanks for the feedback @designatednerd !

@flobories
Copy link
Contributor Author

(actually tied up with other projects at work, will follow up tomorrow 🙃)

@designatednerd
Copy link
Contributor

@flobories Were you ever able to get another look at this?

@flobories
Copy link
Contributor Author

I have not 😬 but it’s been on my mind crushing me with guilt 🤣
I just shipped the feature that has been getting my attention these past couple weeks and this week is my catchup on everything else one!
I’ll prioritize to move this forward before EOW.
I appreciate your patience, and thanks for the nudge!

@designatednerd
Copy link
Contributor

Dude, no guilt! Just checking in to see if you were still interested in moving this forward. It's a gnarly problem to solve, so I very much appreciate that you are interested in solving it rather than me 😆

@flobories flobories changed the title Cancel watcher fetch only if new fetch includes server Do not cancel server fetch of watcher on dependent key update May 12, 2020
Copy link
Contributor Author

@flobories flobories left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@designatednerd, thanks for your patience! Here is my second attempt.
I believe it should not change the behavior much, only the case that I am having issues with.

As you suggested, this replaces the the main .returnCacheDataElseFetch reload with a two step "manual" store load - which does not interrupt any running fetching - followed by a .fetchIgnoringCacheData if the store load fails - which it should if data is missing.

The only change in behavior I am wondering about is threading, since we are introducing one extra level of asynchronicity here in the double load + fetch, but I believe it should all stem from the same store queue that handles store transactions / publishing key changes, so I believe it is correct?

What do you think?

I tested this branch in my app and it resolves my issue!

Sources/Apollo/GraphQLQueryWatcher.swift Outdated Show resolved Hide resolved
@flobories flobories marked this pull request as ready for review May 12, 2020 01:12
Copy link
Contributor

@designatednerd designatednerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I let this sit for a bit! Just a couple comments, otherwise I think this is looking good

Sources/Apollo/GraphQLQueryWatcher.swift Outdated Show resolved Hide resolved
Sources/Apollo/GraphQLQueryWatcher.swift Outdated Show resolved Hide resolved
@flobories
Copy link
Contributor Author

Thanks for the review @designatednerd! Just pushed the updates you requested.

@designatednerd
Copy link
Contributor

Looks good on my end - @RolandasRazma since you caught the issue with the original, wanna double check that I'm not missing something here?

Sources/Apollo/GraphQLQueryWatcher.swift Outdated Show resolved Hide resolved
}
case .failure:
// If the cache fetch is not successful, for instance if the data is missing, refresh from the server.
self.fetch(cachePolicy: .fetchIgnoringCacheData)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is .fetchIgnoringCacheData correct here? it was .returnCacheDataElseFetch before.
Can we get into some fetch cycle here? fetch(cachePolicy: CachePolicy) -> Fail -> fetch(cachePolicy: CachePolicy) -> Fail -> ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before it was returnCacheDataElseFetch because we weren't explicitly trying to load from the cache. By the point this gets hit, we've already checked the cache so fetchIgnoringCacheData acknowledges that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the fail is for reading directly from the store, rather than for hitting the network, so that cycle should not be an issue.

@designatednerd
Copy link
Contributor

OK I'm gonna go ahead and merge this - @flobories @RolandasRazma Thanks so much for all your help!

@designatednerd designatednerd merged commit 31c5339 into apollographql:master May 19, 2020
@designatednerd designatednerd added this to the Next Release milestone May 19, 2020
@flobories
Copy link
Contributor Author

Thanks @designatednerd for your help and patience! Feels great to reach the finish line here!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants