[DO NOT MERGE] Test issue on Linux due to update. #540
Conversation
After adding logs into the relevant crates for transaction handling, the key piece of logging is the following:
This happens in the authorship module, where extrinsics are applied and added to the block. For reference, when transactions are included we see:
In that second stream, one transaction got included while the other did not. How large is the deadline? We define the authorship for the manual seal here and the proposer factory here, both of which are relevant for proposing. Internally, `propose` ends up checking this deadline. The question is why this deadline is being met so quickly. This would also explain, in theory, why it can fail "sometimes", since it depends on the execution time between those processes, which is system dependent.
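To make that failure mode concrete, here is a minimal, runnable sketch of the kind of deadline check described above. All names (`apply_pending`, the numeric "transactions") are invented for illustration; the real logic lives in the authorship code linked above.

```rust
use std::time::Instant;

// Toy model of a proposer-style deadline check: once `now` reaches the
// deadline, the remaining pooled transactions are skipped entirely.
fn apply_pending<X: std::fmt::Debug>(pending: Vec<X>, deadline: Instant) {
    for xt in pending {
        let now = Instant::now();
        if now >= deadline {
            // The transaction was fetched from the pool but is never even
            // attempted; the block gets sealed without it.
            println!("deadline reached: now {now:?} >= deadline {deadline:?}");
            return;
        }
        println!("applied {xt:?}");
    }
}

fn main() {
    // If the deadline has effectively passed by the time authoring starts,
    // nothing is applied, mirroring the "deadline met too early" logs.
    let deadline = Instant::now();
    apply_pending(vec![1u32, 2u32], deadline);
}
```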
@gianfra-t What is the purpose of proposing a transaction?
What do you mean @TorstenStueber?
I am not familiar with the code, but this seems to use manual seal on a chain that runs locally for testing purposes? Is it important to put multiple transactions into a block? Can we not just submit transactions the classical way and avoid running into this problem with the Proposer? (I say this with very superficial knowledge.)
What's the classical way for you? Authoring blocks every 6 to 12 seconds without manual seal?
But we do know that it never fails for the integration tests, and that it also fails if you just run one of the two tests you mentioned in the description in isolation. So I doubt that it's due to resource consumption or processing power of the node. We also know that it doesn't happen on macOS. What if something goes wrong when comparing the current time to the deadline? This wouldn't explain why it works for the integration tests, though. Since we can't touch the transaction-pool and block-authorship code (without forking), @gianfra-t have you tried comparing our configuration for the manual seal with e.g. the one Interlay uses? Maybe there is a discrepancy and we can solve the issue by changing some of the parameters.
I do not know why this doesn't also happen on the integration tests. If they use the exact same manual seal configuration, it may be that, since the chain is generating blocks all the time, the transactions have a chance to be re-included in a later block. In these failing tests, there is no new block creation after 1 or 2 blocks. Although I tried creating new blocks as a potential solution, and that didn't work either... I will try to compare some manual seal configurations, because there is a good chance that changing that can solve this. Ideally, I would like to delay the block proposing process, because even when, as @TorstenStueber said, we send only one transaction, sometimes the deadline is met even after the inclusion of that single one! In the latest CI logs of this PR we can see how the first log of the relevant times (now and deadline), which corresponds to the initiation of the sealing process, happens way too early, before the applying process takes place.
But in the integration tests we are using manual seal as well, so the blocks are not generated all the time either. For every test we launch a separate chain so that the tests don't influence each other and each one has a 'clean environment'. Correct me if I'm wrong or misunderstanding the problem, but the authorship pallet rejects including the transaction in the block because it probably thinks the transaction is too large and would require more processing time than is available in the block (?)
Yes, but in the integration tests transactions are being sent asynchronously all the time, and every time a transaction gets into the pool the node (with manual seal) will start another block; otherwise it won't. Or are you saying that every integration test also waits for the transaction to be finalized? I do not know enough about those tests. Regarding the rejection: what I understand from the logs and the corresponding line where the deadline is checked is that the transaction gets fetched from the pool but is not even attempted to be executed. Even the estimation happens later. And Interlay, for instance, seems to have the same config :/
Hmm, you are right, the size of the transaction couldn't be it. Thanks for sharing the links. This configuration is different, though; maybe it makes a difference?
Mmmm, interesting. What changes is the inherent provider. I think those are applied before the extrinsics, but if ours takes a long time it could make the difference. Worth a shot!
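A toy illustration of that hypothesis (durations and names are invented for the example): since inherents run before extrinsics, whatever time they consume comes straight out of the proposal budget.

```rust
use std::time::{Duration, Instant};

fn main() {
    // Invented proposal budget handed to the proposer when sealing starts.
    let max_duration = Duration::from_millis(100);
    let deadline = Instant::now() + max_duration;

    // Pretend the inherent data providers take longer than the whole budget.
    std::thread::sleep(Duration::from_millis(120));

    // By the time extrinsics would be applied, the deadline has passed, so
    // every pooled transaction is skipped and the block is sealed empty.
    if Instant::now() >= deadline {
        println!("deadline already met before applying any extrinsic");
    } else {
        println!("extrinsics would be applied here");
    }
}
```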
clients/runtime/src/tests.rs
```rust
#![allow(dead_code)]
#![allow(unused_imports)]

#![allow(dead_code)]
#![allow(unused_imports)]
```
I don't think we need these declarations twice.
Nit: remove
* update dependencies, fixing compilation errors (WIP)
* update pallets to 1.1.0
* fix subxt feature flags
* fix client configuration new fields
* fix clippy issues and missing std flag
* update fixes and missing configs for runtimes
* fixes to node configuration (wip)
* node client fixes
* bump subxt to minimum viable version
* use static wrapper for account_id and currency_id metadata replacement
* use account_id struct from subxt implementing Encode and Decode as type to avoid Static wrapper
* fix account type conversion using new subxt type
* missing flag, cleanup unused types
* temporary remove conditional metadata
* use foucoco metadata only
* warnings and type conversion
* re-implement hashable trait
* replace account_id on tests
* use subxt version 0.29 across crates
* modification of service config values
* testing increase in timeout for tests
* prints for testing
* bump to subxt 0.30.0
* temp remove unrelated jobs
* work with ubuntu latest
* bump to subxt 0.33
* seal block periodically
* seal blocks more often
* seal blocks more often
* block finalization force hack
* local linux tests
* testing finalizing first block solution
* setup working locally
* test only integration on ci
* cleaning up temporary changes
* small code improvements
* more comment removal
* use ubuntu latest os for main test
* Use different function to set exchange rate
* `lookup_bytes` is only used here. Removing `.clone()`. [`fn fetch_raw(...)` accepts `Vec<u8>`](/~https://github.com/paritytech/subxt/blob/f06a95d687605bf826db9d83b2932a73a57b169f/subxt/src/storage/storage_type.rs#L51). `fn build()` already [accepts Url](/~https://github.com/paritytech/jsonrpsee/blob/v0.20.0/client/transport/src/ws/mod.rs#L279)
* cleanup
* handle all metadata instances
* temporary disable unused code on test file
* add all-features flag to vault
* addresses the confusing `all-features` feature of `runtime` package; reuse the `UrlParseError` by changing it to `url::ParseError`, instead of jsonrpsee's `InvalidUri`
* comment cleanups
* use macos image for main workflow testing
* revert
* print current directory
* remove toolchain
* ran cargo +nightly-2024-02-09 fmt --all
* revert, to use ubuntu
* Revert to mac-os. This reverts commit 7f25b8f.
* uncomment failing test
* force install rustup
* use macos-13
* ignore subxt issues, and print out proof error
* Revert "ignore subxt issues, and print out proof error". This reverts commit 08da9d6.
* Revert "Merge branch 'main' into polkadot-v1.1.0". This reverts commit 255845c, reversing changes made to 08da9d6.
* ignore failing subxt tests and print out proof error
* update cargo lock (/~https://github.com/pendulum-chain/spacewalk/actions/runs/10079792308/job/27899770357?pr=536#step:14:24)
* fix cargo.lock; update only dia-oracle
* update network
* add back iowa
* [DO NOT MERGE] Test issue on Linux due to update. (#540)
* add extra cargo.toml specially for CI testing.
* stellar relay config sdftest2 ip change
* fixing some regressions from incomplete merge
* fixing more merge issues
* missing changes from #538
* fix AccountId PrettyPrint impl
* fix compare spec version name
* fix benchmark changes merge issue
* remove when possible explicit declaration of pallets in construct_runtime macro
* cargo fmt
* remove all-features feature
* testing inherent modification
* Revert "remove when possible explicit declaration of pallets in construct_runtime macro". This reverts commit 33a6067.
* Revert "Revert "remove when possible explicit declaration of pallets in construct_runtime macro"". This reverts commit 17f8100.
* regenerate metadata after removal of explicit pallet declaration in macro
* Revert "testing inherent modification". This reverts commit 3345c90.
* use pendulum's polkadot-sdk fork with modified constant
* modification of stellar sdftest3 ip
* modify also vault resources sdftest3 ip

Co-authored-by: Marcel Ebert <mail@marcel-ebert.de>
Co-authored-by: b-yap <2826165+b-yap@users.noreply.github.com>
This PR is meant to discuss the issue found in some tests on Linux. It is not intended to be merged, unless of course a fix is found.
The affected tests are `test_register_vault` and `test_subxt_processing_events_after_dispatch_error`. Due to the extra log level, it is evident that the transactions get stuck in the node.
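For anyone trying to reproduce this locally, one way to get the extra log level in a single test is to initialize tracing at the start of the test body. This sketch assumes the test binary can use Substrate's `sp-tracing` crate; spacewalk's actual test setup may differ, and the test body itself is elided:

```rust
#[test]
fn test_register_vault_with_logs() {
    // Honours RUST_LOG, e.g. RUST_LOG=basic-authorship=trace,txpool=debug;
    // safe to call even if a subscriber is already set.
    sp_tracing::try_init_simple();

    // ... actual test body from clients/runtime/src/tests.rs ...
}
```

Running that one test in isolation (`cargo test test_register_vault -- --nocapture`) then surfaces the `now`/`deadline` logs discussed above.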