-
Notifications
You must be signed in to change notification settings - Fork 789
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add runtime-wide try-state
hooks
#4631
base: master
Are you sure you want to change the base?
Conversation
9f658e1
to
26158c9
Compare
bot fmt |
@liamaharon https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6348179 was started for your command Comment |
@liamaharon Command |
13c39a5
to
1551382
Compare
@liamaharon can I get a review from anyone? The PR has now been hanging for quite a while with 0 interactions with anyone from Parity. |
3fbb357
to
47a0b80
Compare
47a0b80
to
5df3125
Compare
@liamaharon can I get anyone to review this PR? In particular, I am faced with a problem: the |
I will try to take a look today. Liam is currently offboarding 😢 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im not sure if i completely get the architecture here.
Can you please briefly put the new traits into the MR description maybe and what they should do? A graphic can also help, but maybe not worth it if we have to change it.
…tytech#4621) Follow-up to the new `XcmDryRunApi` runtime API introduced in paritytech#3872. Taking an extrinsic means the frontend has to sign first to dry-run and once again to submit. This is bad UX which is solved by taking an `origin` and a `call`. This also has the benefit of being able to dry-run as any account, since it needs no signature. This is a breaking change since I changed `dry_run_extrinsic` to `dry_run_call`, however, this API is still only on testnets. The crates are bumped accordingly. As a part of this PR, I changed the name of the API from `XcmDryRunApi` to just `DryRunApi`, since it can be used for general dry-running :) Step towards paritytech#690. Example of calling the API with PAPI, not the best code, just testing :) ```ts // We just build a call, the arguments make it look very big though. const call = localApi.tx.XcmPallet.transfer_assets({ dest: XcmVersionedLocation.V4({ parents: 0, interior: XcmV4Junctions.X1(XcmV4Junction.Parachain(1000)) }), beneficiary: XcmVersionedLocation.V4({ parents: 0, interior: XcmV4Junctions.X1(XcmV4Junction.AccountId32({ network: undefined, id: Binary.fromBytes(encodeAccount(account.address)) })) }), weight_limit: XcmV3WeightLimit.Unlimited(), assets: XcmVersionedAssets.V4([{ id: { parents: 0, interior: XcmV4Junctions.Here() }, fun: XcmV3MultiassetFungibility.Fungible(1_000_000_000_000n) } ]), fee_asset_item: 0, }); // We call the API passing in a signed origin const result = await localApi.apis.XcmDryRunApi.dry_run_call( WestendRuntimeOriginCaller.system(DispatchRawOrigin.Signed(account.address)), call.decodedCall ); if (result.success && result.value.execution_result.success) { // We find the forwarded XCM we want. The first one going to AssetHub in this case. const xcmsToAssetHub = result.value.forwarded_xcms.find(([location, _]) => ( location.type === "V4" && location.value.parents === 0 && location.value.interior.type === "X1" && location.value.interior.value.type === "Parachain" && location.value.interior.value.value === 1000 ))!; // We can even find the delivery fees for that forwarded XCM. const deliveryFeesQuery = await localApi.apis.XcmPaymentApi.query_delivery_fees(xcmsToAssetHub[0], xcmsToAssetHub[1][0]); if (deliveryFeesQuery.success) { const amount = deliveryFeesQuery.value.type === "V4" && deliveryFeesQuery.value.value[0].fun.type === "Fungible" && deliveryFeesQuery.value.value[0].fun.value.valueOf() || 0n; // We store them in state somewhere. setDeliveryFees(formatAmount(BigInt(amount))); } } ``` --------- Co-authored-by: Bastian Köcher <git@kchr.de>
5df3125
to
56e91c7
Compare
@ggwpez I actually re-checked the description I gave to the PR, and I am not sure what is unclear about it. Could you please tell me what's the part that you think should be improved? I guess what I did not highlight strongly enough, probably, is that this PR does not aim to be a overhaul of the |
unsafe { | ||
CANARY_FLAG = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CANARY_FLAG::set(true);
in combination with parameter_types!
pub static please. Cell would also work, but normally we use parameter types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in a87c594.
for_tuples!( where #( Tuple: TryStateLogic<BlockNumber> )* ); | ||
fn matches_id(id: &[u8]) -> bool { | ||
for_tuples!( #( if Tuple::matches_id(id) { return true; } )* ); | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That means that it returns true when one of the tuple elements returns true? But what if we only want to run that single tuple element instead of the whole tuple?
I think i am missing some critical info about this merge request, which is why i dont get what it was done this way.
Is it not enough to just add a TryState
to the end of Executive
so that it would look like this?
pub struct Executive<
System,
Block,
Context,
UnsignedValidator,
AllPalletsWithSystem,
OnRuntimeUpgrade = (),
RuntimeTryState = (),
>(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps the context in #235 could help you understand better?
The goal of this PR is to achieve exactly what you are suggesting here. Having a generic to run try-state
checks at the runtime level. The problem with the current structure (as also mentioned in the linked issue), is that the implementation on tuples is coupled with the PalletInfoAccess
trait, as you can see here:
for_tuples!( where #( Tuple: crate::traits::PalletInfoAccess )* ); |
This means that in order to use tuple elements that are not pallet, such as runtime-wide logic, the coupling to such a trait must be removed. And because of some requirements the impl_for_tuples
has about knowing the type and size of the types involved, it was not possible to use anything else than a function that returns true
if the specified identifier matches what was provided by the user in the try-state
command.
So now if you have a tuple of, for instance, (CustomLogic, AllPalletsWithSystem)
, the matching function will return true
if CustomLogic
matches, or if any pallet has the specified ID. It is assumed that no two components, be they pallets or else, have the same ID, as that would mean the latter of them would never be called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And this goes beyond the AllPalletsWithSystem
usage. If even just CustomLogic
is itself a tuple, then with the current design it is required to implement PalletInfoAccess
, which of course makes no sense for anything else than a pallet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal of this PR is to achieve exactly what you are suggesting here. Having a generic to run try-state checks at the runtime level
Then the easiest way is to just make a new trait, since the TryState
trait is aimed at pallets (should probably have been named PalletTryState
, but now we are stuck with this for compatibility reasons).
pub trait RuntimeTryState {
fn try_state(whatever...) -> Result<(), TryRuntimeError>;
}
Anyway, i dont really want to block this approach if it has clear advantages over ^, so going to ask @kianenigma what he thinks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, let's wait for Kian's input, if it comes in a reasonable time 😇
As usual, I do not honestly care about HOW this feature is introduced, as long as it becomes possible. I thought that introducing a minimal set of changes would be the preferred solution. Having a new trait would then means that the AllPalletsWithSystem
implements one trait, and the new generic would implement a different trait, and I tend not to like this idea 😁 Also, I don't understand why a pallet is required to make judgments about the TryState::Select
parameter, which is always ignored anyway in implementations, when all they care is, at most, the BlockNumber
. That design flaw I tried to address here, by removing the second argument for pallets implementations, and leave it only inside the executive try-runtime
features, which is the only place where any logic should look at the logic to filter which tests to run. But this conversation would probably be out of scope for this PR, which aims to introduce the feature breaking as little things as possible. Also because, as Kian mentioned, there might have been already talks as to how overhaul the try-runtime stuff altogether.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've had a lot of iterations here, thank you both, esp @ntn-x2 for exploring solutions with us.
The systemic mistake I am realizing (noted for future) is that we should have first discussed the "ideal outcome" vs the approach. So let's do that.
what should we expect of TryStateSelect
in the context of this PR? Should we be able to tweak RuntimeTryState
with it?
If yes, the current approach is the right way. If not, Oliver's approach is much simpler and RuntimeTryState
will always be executed.
I am leaning toward not allowing runtime level try-state logic to be configurable with TryStateSelect
.
First reason: This API, as pointed out here, is already somewhat broken, and we are not even sure if it is "enough". In that, the point of it was to regulate how much execution time is given to try-state hooks. But, through experience, we have seen that the issue is that a single pallet has a try-state-bomb that takes forever. In an ideal sense, instead of TryStateSelect
you would pass some Weight
-equivalent amount, and this much weight was used in try-state. With PVM we can do this.
Second reason: If we allow TryStateSelect
to properly configure runtime-try-state, the variants would be a lot. If you want "round-robin pallets + runtime", or "pallets x, y, z and runtime". These are hard to express.
Third Reason: Let's do the simpler approach. IFF someone needs to disable runtime level try-state, we have a blueprint of how to solve it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But, through experience, we have seen that the issue is that a single pallet has a try-state-bomb that takes forever. In an ideal sense, instead of
TryStateSelect
you would pass someWeight
-equivalent amount, and this much weight was used in try-state. With PVM we can do this.
PVM will not come tomorrow and I also don't see why we should need this here. These try-state
hooks are not being run as part of the on chain logic. How long they take is really not important at all. We need to ensure that they finish on time. If there is some kind of "try-state bomb", it probably means that we read too much state. Still the amount of state is not in the terabytes or similar, thus I would say that our current read performance needs to be optimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kianenigma sorry I think I did not fully understand what your suggestion here is. Are you suggesting to have a new trait which takes only a block number and that is required to be implemented by the new generic? That would mean that the TryStateSelect
logic would only be applied to pallets, while all other runtime try-states would all always be executed. Is that the proposed way forward? Would the runtime logic still be called before the AllPallets
logic?
The CI pipeline was cancelled due to failure one of the required jobs. |
Fixes #235.
I tried different approaches, including the suggestion of adding a
fn id() -> String;
function on thetrait TryState<BlockNumber>
trait, but that was giving me issues when trying to deriveTryState
for tuples (what is theid()
of a tuple of elements such asAllPalletsWithSystem
)? Also, it feels less than ideal to me for a trait to implement both logic that can take aSelect::Only
parameter (on which you should apply filtering logic, that's why it's ignored in 100% of the cases that I found) and also returns anid() -> String
in there.So, I am proposing the following approach:
Executive
takes a new generic with a default()
implementation for customTryState
hooks, as done for theOnRuntimeUpgrade
ones. Like theOnRuntimeUpgrade
hooks,TryState
hooks are executed before pallet-specific hooks.trait TryStateLogic<BlockNumber>
which is responsible for thetry-state
logic itself, and thetrait IdentifiableTryStateLogic<BlockNumber>: TryStateLogic<BlockNumber>
, which returns a function that is used to match the specific test with one of the provided identifiers inTryStateSelect::Only(names)
. The two traits could be collapsed into one, if need be, as long as they expose the two functions declared here.pallet
macro also implements the two new traits, implementingTryStateLogic<BlockNumber>
re-using theHooks<BlockNumber
logic fortry_state()
andIdentifiableTryStateLogic<BlockNumber>
re-using thePalletInfoAccess
logic forname()
.Open points
TryState
trait for the empty tuple()
, the(T,)
tuple and the(T1, T2)
tuple, which covers the(CustomHook, AllPallets*)
case. We might want to have a macro where the wholewhere
clause is configurable, not only additional traits. But it would happen in a different PR.Executive
unit tests to test that theTryState
hooks are called in the right order. BecauseTryState
hooks are not supposed to modify storage, I could not use the same approach used forOnRuntimeUpgrade
hooks. Hence, I used a static variable that is reset tofalse
with a new test ext, and set totrue
when the test customTryState
is invoked. I did not get any errors in the tests, but it could be that this leads to concurrency issues, in which case an alternative strategy is needed, perhaps including an additional boolean flag in what is returned bynew_test_ext
?