-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(iroh): don't hit network for complete blobs&collections #2576
Conversation
Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/2576/docs/iroh/ Last updated: 2024-08-02T14:38:15Z |
…nd do not attempt to download from ourselves
22d5acd
to
8845025
Compare
lgtm, can't approve my own pr 😅 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I have limited experience in this part of the code.
One note is that we should document the additions as breaking changes in the Dialer
and Getter
traits.
527bb93
to
cc9fa77
Compare
As discussed in Discord, the previous fix was incorrect as it would report a succeeding download for collections if the root blob is complete even if we had no or only some of the children available locally. I added two commits that fix this:
Also updated the PR description. Please re-review :) |
cc9fa77
to
ccf15d4
Compare
|
||
let mut children: Vec<Hash> = vec![]; | ||
while let Some(hash) = hash_seq.next().await? { | ||
children.push(hash); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why collect first, instead of directly doing the db checks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because with the PR as-is, we may only emit events here if everything is complete. Otherwise the same events would be emitted again when actually starting the download.
This is suboptimal for sure. See #2586 for a revised approach that does not need to collect events by using a generator and turning get_to_db
into a two-step process.
}, | ||
) | ||
.await? | ||
.await |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should probably check the events in these tests, to make sure they look like we expect them, especially in the collection case
closing in favor of #2586 |
…are complete (#2586) ## Description Two changes to the downloader: * Never try to download from ourselves. If the only provider node added is our own node, fail with error "no providers". * The actual download request flow is turned into a generator (while keeping API compatibility for the existing `get_to_db` public function). A new `get_to_db_in_steps` function either runs to completion if the requested data is fully available locally, or yields a `NeedsConn` struct at the point where it needs a network connection to proceed. The `NeedsConn` has an `async proceed(self, conn: Connection)`, which must be called with a connection for the actual download to start. This two-step process allows the downloader to check if we should dial nodes at all, or are already done without doing anything, while emitting the exact same flow of events (because we run the same loop) to the client. To achieve this, `get_to_db` now uses a genawaiter generator internally. This means that the big loop that is the iroh-blobs protocol request flow does not have to be changed at all, only that instead of a closure we yield and resume, which makes this much easier to integrate into an external state machine like the downloader. The changes needed for this for the downloader are a bit verbose because the downloader itself is generic over a `Getter`, with impls for the actual impl and a test impl that does not use networking; therefore the new `NeedsConn` state has to be modeled with an additional associated type and trait here. This PR adds three tests: * Downloading a missing blob from the local node fails without trying to connect to ourselves * Downloading an existing blob succeeds without trying to download * Downloading an existing collection succeeds without trying to download Closes #2575 Replaced #2576 ## Notes and open questions ## Breaking changes None, only an API addition to the public API of iroh_blobs: `iroh_blobs::get::check_local_with_progress_if_complete` --------- Co-authored-by: dignifiedquire <me@dignifiedquire.com>
Description
Two changes to the downloader:
The second point turned out to be a bit more complicated:
FoundLocal
events so that the state tracking at the client works the same no matter if an actual download was performed or not.Both these things are now implemented, through a new function
check_local_with_progress_if_complete
in iroh_blobs that is called before starting the download. It checks if the blob is fully available locally (and all children too for hashseqs) and if so returnstrue
, and emits theFoundLocal
events. No events are emitted if the blob is not completely available, as in that case we start the actual download machinery, which will emit theFoundLocal
events too on the go.Adds three tests:
Closes #2575
Notes and open questions
The API around the new function and the exisitng
get_to_db
could be improved, and we could skip the double-checking, but that would be a more involved change to a public API that I didn't want to make.Breaking changes
None, only an API addition to the public API of iroh_blobs:
iroh_blobs::get::check_local_with_progress_if_complete