Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bitswap: move providing -> Exchange-layer, providerQueryManager -> routing #641

Merged
merged 1 commit into from
Nov 25, 2024

Conversation

aschmahmann
Copy link
Contributor

fixes #640

This is an attempt at splitting the ProviderQueryManager out of Bitswap so that we can configure it more in consumers.

@aschmahmann aschmahmann force-pushed the feat/configurable-providequerymanager branch from a6677c4 to f538d1f Compare July 26, 2024 11:06
Copy link

codecov bot commented Jul 26, 2024

Codecov Report

Attention: Patch coverage is 81.09453% with 38 lines in your changes missing coverage. Please review.

Project coverage is 60.35%. Comparing base (c91cc1d) to head (6d9cc8b).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...uting/providerquerymanager/providerquerymanager.go 83.54% 12 Missing and 1 partial ⚠️
bitswap/client/client.go 84.44% 6 Missing and 1 partial ⚠️
exchange/providing/providing.go 62.50% 4 Missing and 2 partials ⚠️
bitswap/options.go 0.00% 4 Missing ⚠️
bitswap/client/internal/session/session.go 50.00% 2 Missing and 1 partial ⚠️
bitswap/network/ipfs_impl.go 62.50% 2 Missing and 1 partial ⚠️
bitswap/testnet/virtual.go 83.33% 1 Missing ⚠️
provider/noop.go 0.00% 1 Missing ⚠️

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #641      +/-   ##
==========================================
- Coverage   60.43%   60.35%   -0.08%     
==========================================
  Files         243      244       +1     
  Lines       31059    31034      -25     
==========================================
- Hits        18771    18732      -39     
- Misses      10628    10633       +5     
- Partials     1660     1669       +9     
Files with missing lines Coverage Δ
bitswap/bitswap.go 69.13% <100.00%> (+2.07%) ⬆️
...tswap/client/internal/messagequeue/messagequeue.go 84.92% <ø> (ø)
bitswap/server/server.go 56.57% <100.00%> (-5.39%) ⬇️
bitswap/testinstance/testinstance.go 86.44% <100.00%> (+0.23%) ⬆️
bitswap/testnet/peernet.go 38.46% <100.00%> (-4.40%) ⬇️
blockservice/test/mock.go 100.00% <100.00%> (ø)
examples/bitswap-transfer/main.go 41.21% <100.00%> (ø)
provider/provider.go 25.74% <ø> (ø)
provider/reprovider.go 68.63% <100.00%> (+0.59%) ⬆️
routing/mock/centralized_client.go 58.53% <100.00%> (-3.97%) ⬇️
... and 9 more

... and 4 files with indirect coverage changes

@aschmahmann aschmahmann force-pushed the feat/configurable-providequerymanager branch from f538d1f to e95eeb2 Compare July 26, 2024 11:11
Comment on lines 81 to 86
FindProvidersAsync(ctx context.Context, k cid.Cid) <-chan peer.ID
}

type FindAllProviders struct {
Router bsnet.BitSwapNetwork
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to keep the function here as not taking a count the name should get changed from FindProvidersAsync to one that doesn't collide with the more widely used function that does take a count so implementations can use both if they want.

Comment on lines 91 to 116
provCh := make(chan peer.ID)
wg := &sync.WaitGroup{}
for p := range providers {
wg.Add(1)
go func(p peer.ID) {
defer wg.Done()
span.AddEvent("FoundProvider", trace.WithAttributes(attribute.Stringer("peer", p)))
err := r.Router.ConnectTo(ctx, p)
if err != nil {
span.RecordError(err, trace.WithAttributes(attribute.Stringer("peer", p)))
log.Debugf("failed to connect to provider %s: %s", p, err)
return
}
span.AddEvent("ConnectedToProvider", trace.WithAttributes(attribute.Stringer("peer", p)))
select {
case provCh <- p:
case <-ctx.Done():
return
}
}(p)
}
go func() {
wg.Wait()
close(provCh)
}()
return provCh
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this was copied out of the ProviderQueryManager because the internals (or at least the tests) seem to rely on having the connections made explicit here. It seems reasonable to extract this up a level out of the routing component though.

@gammazero
Copy link
Contributor

gammazero commented Oct 28, 2024

@aschmahmann Does this mean we still want the provider query manager, and need to close #536 which removes it.

@lidel lidel requested a review from hsanjuan November 12, 2024 18:01
Copy link
Contributor

@hsanjuan hsanjuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this comes with the need not to replace the default provider itself, but to be able to create it by hand and turn the knobs (and guessing it's mostly about increasing the amount of workers).

It is similar to #535 although that includes some other things (FindProvidersAsync returns AddrInfos instead of PeerIDs, and Router and Network are separate things).

If we need it we can merge this, otherwise the direction of #535 seems better but I guess it's a more invasive change (haven't followed too much). Code-wise I don't see anything weird, so green light on that front.

Comment on lines +102 to +112

func WithClientOption(opt client.Option) Option {
return Option{opt}
}

func WithServerOption(opt server.Option) Option {
return Option{opt}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an unrelated change and should be a separate PR, I'll raise it separately. TLDR though is that bitswap.New can take options for the client or the server, but right now the only way to construct them is if they're forwarded as regular Bitswap options, otherwise you have to construct the client and server yourself. This saves on duplication while being more explicit and was something I ran into here because a new client option was introduced.

Comment on lines 105 to 110
func WithDefaultLookupManagement(b bool) Option {
return func(bs *Client) {
bs.useDefaultLookupManagement = true
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #535 , the Network is separated from the Router and that makes sense to me. As a user, if I want to use a custom Router, I am going to have to glue the normal Network (ConnectTo) and my custom Router (FindProvidersAsync) into a custom BitswapNetwork. I'm not sure if that requires much more refactoring though, given that #535 is based on other additional changes.

This would also allow to disable the ProviderQueryManager altogether when it is not passed in (here it's impossible).

hsanjuan
hsanjuan previously approved these changes Nov 15, 2024
Comment on lines -377 to -392
func (bsnet *impl) FindProvidersAsync(ctx context.Context, k cid.Cid, max int) <-chan peer.ID {
out := make(chan peer.ID, max)
go func() {
defer close(out)
providers := bsnet.routing.FindProvidersAsync(ctx, k, max)
for info := range providers {
if info.ID == bsnet.host.ID() {
continue // ignore self as provider
}
bsnet.host.Peerstore().AddAddrs(info.ID, info.Addrs, peerstore.TempAddrTTL)
select {
case <-ctx.Done():
return
case out <- info.ID:
}
}
}()
return out
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this is not needed because our query manager is doing Connect(peerAddr) instead of ConnectTo(peerID). Addresses will be absorbed into the peerstore directly. At his point BSnet is only a convenience to have content router available where it is not passed explicitally.

span.AddEvent("FoundProvider", trace.WithAttributes(attribute.Stringer("peer", p)))
err := pqm.network.ConnectTo(findProviderCtx, p)
span.AddEvent("FoundProvider", trace.WithAttributes(attribute.Stringer("peer", p.ID)))
err := pqm.dialer.Connect(findProviderCtx, p)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The beauty is this can now be a libp2p.Host directly. No longer need to rely on the network having absorbed provider peer addresses into the peerstore prior to connecting because we have them here already.

pqm := bspqm.New(ctx, network)
if bs.pqm == nil { // not set with the options
// network can do dialing and also content routing.
pqm, err := rpqm.New(ctx, network, network, rpqm.WithMaxProviders(10))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't want to pass-in a Host and a ContentRouter directly to the Client, we need to rely on Network, which has them, but it would be better to remove Content Routing capabilities from the Network because it is just wrapping the content Router and nothing else at this point.

@hsanjuan
Copy link
Contributor

lol I meant to just comment, not approve my own changes

Copy link
Contributor Author

@aschmahmann aschmahmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @hsanjuan for the changes. Looks reasonable to me, left a couple questions

bitswap/client/client.go Outdated Show resolved Hide resolved
@@ -51,7 +51,7 @@ func (c *client) FindProvidersAsync(ctx context.Context, k cid.Cid, max int) <-c
go func() {
defer close(out)
for i, p := range c.server.Providers(k) {
if max <= i {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an unrelated fix due to this code not understanding that 0 means infinity, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, you did this change 🤷‍♂️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, lol. Yeah, that's why it's there. I was noticing some weirdness with testing.

bitswap/network/ipfs_impl.go Show resolved Hide resolved
@hsanjuan
Copy link
Contributor

hsanjuan commented Nov 15, 2024

Sync with @aschmahmann :

  • ProviderQueryManager looks a lot like a ContentRouter itself.
  • Hector proposes removing the ContentRouter (CR) part from BitSwapNetwork and passing it directly to Bitswap.New(). This is a breaking change (people need to pass CR to Bitswap.New() rather than to the bsnet.New().
  • Adin wants to make ContentRouter completely optional in Bitswap Client, after removing from Server. That provides further justification to break bsnet and Bitswap.New() behaviour.
  • When ContentRouter is provided, it might already be a disguised ProviderQueryManager, so users could:
    • Use no ContentRouting
    • Use their custom ContentRouting
    • Use ProviderQueryManager
    • This means users would need to remember to manually create ProviderQueryManager on upgrade, to keep things working as before. That's shitty and they will likely forget.
    • Alternatively bitswap can:
      • Receive a ContentRouting
      • Also have an option about whether that content routing should be wrapped in a default ProviderQueryManager or not. Defaulting to yes.

I think this is the gist @aschmahmann . I think the last option would be ok (there's breakage but it both a) has a stronger reason b) is easy to fix.

@aschmahmann
Copy link
Contributor Author

I think the last option would be ok (there's breakage but it both a) has a stronger reason b) is easy to fix.

Do you mean to do this?

  1. Remove the Providing out of the Bitswap server (e.g. something like Move providing responsabilities from bitswap to blockservice #677)
  2. Break the bitswap.Network by removing the ContentRouting
  3. Add two options to bitswap.Client, 1 for adding a ProviderFinder and another for whether the default ProviderQueryManager should be used

Existing consumers would migrate by moving their ContentRouter from bitswap.Network into the bitswap.Client option

If so, that sounds reasonable to me. If you think pulling out the Providing is too much to do here we could also do something like:

  1. Break the bitswap.Network by removing the ContentRouting
  2. Add two options to bitswap.Bitswap, 1 for adding a ContentRouting and another (in bitswap.Client) for whether the default ProviderQueryManager should be used
    ...
    In a follow up PR:
  3. Remove the Providing out of the Bitswap server
  4. Change the bitswap.Bitswap option to forward to a bitswap.Client option and pass a ProviderFinder instead of a ContentRouter (is just a subset so nothing should break)

@hsanjuan
Copy link
Contributor

hsanjuan commented Nov 18, 2024

  1. Remove the Providing out of the Bitswap server (e.g. something like Move providing responsabilities from bitswap to blockservice #677)
  2. Break the bitswap.Network by removing the ContentRouting
  3. Add two options to bitswap.Client, 1 for adding a ProviderFinder and another for whether the default ProviderQueryManager should be used

ok, currently we are here... see my review for a couple of things

bitswap/client/client.go Outdated Show resolved Hide resolved
bitswap/client/client.go Outdated Show resolved Hide resolved
@@ -254,20 +254,20 @@ func (nc *networkClient) Stats() bsnet.Stats {
}

// FindProvidersAsync returns a channel of providers for the given key.
func (nc *networkClient) FindProvidersAsync(ctx context.Context, k cid.Cid, max int) <-chan peer.ID {
func (nc *networkClient) FindProvidersAsync(ctx context.Context, k cid.Cid, max int) <-chan peer.AddrInfo {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be removed now, but more important, we need to wire the mock content-router to the tests at a higher level (instance generator etc) as it is no longer in the network. currently the tests work without content routing.

@hsanjuan hsanjuan force-pushed the feat/configurable-providequerymanager branch 2 times, most recently from 6c00392 to 7359d9f Compare November 19, 2024 15:39
@@ -114,6 +114,7 @@ The following emojis are used to highlight certain changes:
- `bitswap/client` fix memory leak in BlockPresenceManager due to unlimited map growth. [#636](/~https://github.com/ipfs/boxo/pull/636)
- `bitswap/network` fixed race condition when a timeout occurred before hole punching completed while establishing a first-time stream to a peer behind a NAT [#651](/~https://github.com/ipfs/boxo/pull/651)
- `bitswap`: wantlist overflow handling now cancels existing entries to make room for newer entries. This fix prevents the wantlist from filling up with CIDs that the server does not have. [#629](/~https://github.com/ipfs/boxo/pull/629)
- 🛠 `bitswap` & `bitswap/server` no longer provide to content routers, instead you can use the `provider` package because it uses a datastore queue and batches calls to ProvideMany.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reminder to fix this before merge

@@ -18,7 +18,7 @@ var logR = logging.Logger("reprovider.simple")
// Provider announces blocks to the network
type Provider interface {
// Provide takes a cid and makes an attempt to announce it to the network
Provide(cid.Cid) error
Provide(context.Context, cid.Cid, bool) error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated this interface so that it matches the signature from ContentRouting, so that we can drop this interface altogether if they merge libp2p/go-libp2p#3048

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was merged, now wait for libp2p release

)

// NewTestInstanceGenerator generates a new InstanceGenerator for the given
// testnet
func NewTestInstanceGenerator(net tn.Network, netOptions []bsnet.NetOpt, bsOptions []bitswap.Option) InstanceGenerator {
func NewTestInstanceGenerator(net tn.Network, routing mockrouting.Server, netOptions []bsnet.NetOpt, bsOptions []bitswap.Option) InstanceGenerator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to pass it in from tests because some tests do want to share a mock router among the different things that use it (Blockservice/Instance etc).

@@ -84,10 +88,11 @@ func ConnectInstances(instances []Instance) {

// Instance is a test instance of bitswap + dependencies for integration testing
type Instance struct {
Peer peer.ID
Identity tnet.Identity
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a test that needs the identity to create a mock client provider on a different provider server to test that blocks provided on a separate network are not received etc etc. so this was necessary.

bitswap/client/bitswap_with_sessions_test.go Show resolved Hide resolved
Comment on lines +201 to +208
// careful when bs.pqm is nil. Since we are type-casting it
// into session.ProviderFinder when passing it, it will become
// not nil. Related:
// https://groups.google.com/g/golang-nuts/c/wnH302gBa4I?pli=1
var pqm bssession.ProviderFinder
if bs.pqm != nil {
pqm = bs.pqm
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allowing to pass nil parameters is not awesome either...

bitswap/client/internal/session/session.go Outdated Show resolved Hide resolved
blockservice/blockservice.go Outdated Show resolved Hide resolved
blockservice/providing_blockstore.go Outdated Show resolved Hide resolved
@hsanjuan hsanjuan force-pushed the feat/configurable-providequerymanager branch 3 times, most recently from 94d755b to 0ca70f4 Compare November 20, 2024 16:18
@hsanjuan
Copy link
Contributor

@aschmahmann @gammazero I have a question: I was going to port the "Provider" logic from bitswap server, with its workers and queues and what not. Then I saw that we have a provider.System (the reprovider) that does have queues, does batching and even stores pending provides to disk. Naively my thought is that, if such logic is already implemented in the reprovider, a providing.Exchange should not yet have its own way and we should just pass-in the reprovider when using it. Plus I suspect that we make use of things like Accelerated-DHT-providing, ProvideMany, so why going around...

What do you think?

@aschmahmann
Copy link
Contributor Author

What do you think?

Makes sense to me to leave it out. It's unclear how useful the parallel queues were beforehand, but it seems like the kind of thing that's situational enough that people can choose to add whatever queues they need and we should start off keeping it simple and encouraging reuse of the existing queue.

@hsanjuan hsanjuan force-pushed the feat/configurable-providequerymanager branch from b7c5f16 to 19cd67f Compare November 21, 2024 15:32
@hsanjuan hsanjuan marked this pull request as ready for review November 21, 2024 16:19
@hsanjuan hsanjuan requested review from lidel and a team as code owners November 21, 2024 16:19
@hsanjuan
Copy link
Contributor

ipfs/kubo#10595 is testing this.

gateway/backend_blocks.go Outdated Show resolved Hide resolved
@gammazero gammazero self-requested a review November 25, 2024 16:54
@gammazero gammazero self-assigned this Nov 25, 2024
@hsanjuan hsanjuan changed the title configurable providerquerymanager Bitswap: move providing -> Exchange-layer, providerQueryManager -> routing Nov 25, 2024
This PR performs a rather large and touchy refactor of things related to
Content providing and Content discovery previously embedded into Bitswap.

The motivations:

  * Make ProviderQueryManager options configurable

  * Align and separate coalesced layers: content routing must not be part of
  bitswap as in the future we will be using different exchanges (bitswap, http)
  for retrieval and content routing should be above exchange.

  * Align content routing interfaces with libp2p: to avoid crust, wrappers and
    user confusion, align Providers and Discovery types to
    libp2p.ContentRouting.

  * Reduce duplicated functionality: i.e. code that handles providing in
  multiple places and fails to take advantage of ProvideMany optimizations.

As a result:

  * ProviderQueryManager is now part of the routing module

  * A new providing.Exchange has been created

  * Bitswap initialization params have changed and Bitswap Network doesn't
    provide anymore (see changelog for more details)

Co-authored-by: Andrew Gillis <11790789+gammazero@users.noreply.github.com>
Co-authored-by: Adin Schmahmann <adin.schmahmann@gmail.com>
@hsanjuan hsanjuan force-pushed the feat/configurable-providequerymanager branch from 8b304f2 to 6d9cc8b Compare November 25, 2024 17:21
@hsanjuan
Copy link
Contributor

Squashed

@hsanjuan hsanjuan dismissed their stale review November 25, 2024 20:34

self review

@gammazero gammazero merged commit 37756ce into main Nov 25, 2024
15 checks passed
@gammazero gammazero deleted the feat/configurable-providequerymanager branch November 25, 2024 21:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove requirement on ProviderQueryManager from the bitswap client
3 participants