Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
net/libp2p: Enforce outbound request-response timeout limits (#7222)
This PR enforces that outbound requests are finished within the specified protocol timeout. The stable2412 version running libp2p 0.52.4 contains a bug which does not track request timeouts properly: - libp2p/rust-libp2p#5429 The issue has been detected while submitting libp2p -> litep2p requests in kusama. This aims to check that pending outbound requests have not timedout. Although the issue has been fixed in libp2p, there might be other cases where this may happen. For example: - libp2p/rust-libp2p#5417 For more context see: #7076 (comment) 1. Ideally, the force-timeout mechanism in this PR should never be triggered in production. However, origin/stable2412 occasionally encounters this issue. When this happens, 2 warnings may be generated: - one warning introduced by this PR wrt force timeout terminating the request - possible one warning when the libp2p decides (if at all) to provide the response back to substrate (as mentioned by @alexggh [here](/~https://github.com/paritytech/polkadot-sdk/pull/7222/files#diff-052aeaf79fef3d9a18c2cfd67006aa306b8d52e848509d9077a6a0f2eb856af7L769) and [here](/~https://github.com/paritytech/polkadot-sdk/pull/7222/files#diff-052aeaf79fef3d9a18c2cfd67006aa306b8d52e848509d9077a6a0f2eb856af7L842) 2. This implementation does not propagate to the substrate service the `RequestFinished { error: .. }`. That event is only used internally by substrate to increment metrics. However, we don't have the peer information available to propagate the event properly when we force-timeout the request. Considering this should most likely not happen in production (origin/master) and that we'll be able to extract information by warnings, I would say this is a good tradeoff for code simplicity: /~https://github.com/paritytech/polkadot-sdk/blob/06e3b5c6a7696048d65f1b8729f16b379a16f501/substrate/client/network/src/service.rs#L1543 ### Testing Added a new test to ensure the timeout is reached properly, even if libp2p does not produce a response in due time. I've also transitioned the tests to using `tokio::test` due to a limitation of [CI](/~https://github.com/paritytech/polkadot-sdk/actions/runs/12832055737/job/35784043867) ``` --- TRY 1 STDERR: sc-network request_responses::tests::max_response_size_exceeded --- thread 'request_responses::tests::max_response_size_exceeded' panicked at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/time/interval.rs:139:26: there is no reactor running, must be called from the context of a Tokio 1.x runtime ``` cc @paritytech/networking --------- Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io> Co-authored-by: Bastian Köcher <git@kchr.de>
- Loading branch information