Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Staking ledger bonding fixes #3639

Merged
merged 23 commits into from
Mar 14, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
23 commits
Select commit Hold shift + click to select a range
30b00e3
Add more try runtime checks to staking
gpestana Feb 28, 2024
8b49b43
more changes
gpestana Mar 1, 2024
20a87e2
clean up
gpestana Mar 6, 2024
bb837d2
Merge branch 'master' into gpestana/3245ledger_consistency_runtimecheck
gpestana Mar 6, 2024
31cbd74
fixes set controller and batch deprecation when controller is double …
gpestana Mar 11, 2024
6bc85bc
removes unecessary warns
gpestana Mar 11, 2024
16a692e
prevents controller to become a stash
gpestana Mar 13, 2024
47f9e25
warn only try-runtime checks
gpestana Mar 13, 2024
8ed57de
prevents calling bond_extra in inconsistent ledgers
gpestana Mar 13, 2024
410d784
Update substrate/frame/staking/src/mock.rs
gpestana Mar 13, 2024
86b0ad7
adds bad state checks at the staking ledger level; ensures all access…
gpestana Mar 13, 2024
6e16fa4
nit
gpestana Mar 13, 2024
5986a4e
comments nits
gpestana Mar 13, 2024
578a407
add tests for staking ledger fetch in bad state
gpestana Mar 13, 2024
8dd4818
adds prdoc
gpestana Mar 13, 2024
52c648e
fixes prdoc
gpestana Mar 13, 2024
e3e9672
Update substrate/frame/staking/src/ledger.rs
gpestana Mar 13, 2024
541037f
Update substrate/frame/staking/src/ledger.rs
gpestana Mar 13, 2024
0c1bff6
replaces btreemap with btreeset in try runtime checks
gpestana Mar 13, 2024
371b0cf
simplifies getter check
gpestana Mar 13, 2024
4670659
check if bonded controller is the expected
gpestana Mar 13, 2024
649bf59
Merge branch 'master' into gpestana/3245ledger_consistency_runtimecheck
gpestana Mar 14, 2024
0dde68a
Empty-Commit
kianenigma Mar 14, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 19 additions & 0 deletions prdoc/pr_3639.prdoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
title: Prevents staking controllers from becoming stashes of different ledgers; Ensures that no ledger in bad state is mutated.

doc:
- audience: Runtime User
description: |
This PR introduces a fix to the staking logic which prevents an existing controller from bonding as a stash of another ledger, which
lead to staking ledger inconsistencies down the line. In addition, it adds a few (temporary) gates to prevent ledgers that are already
in a bad state from mutating its state.

In summary:
* Checks if stash is already a controller when calling `Call::bond` and fails if that's the case;
* Ensures that all fetching ledgers from storage are done through the `StakingLedger` API;
* Ensures that a `Error::BadState` is returned if the ledger bonding is in a bad state. This prevents bad ledgers from mutating (e.g.
`bond_extra`, `set_controller`, etc) its state and avoid further data inconsistencies.
* Prevents stashes which are controllers or another ledger from calling `set_controller`, since that may lead to a bad state.
* Adds further try-state runtime checks that check if there are ledgers in a bad state based on their bonded metadata.

crates:
- name: pallet-staking
61 changes: 53 additions & 8 deletions substrate/frame/staking/src/ledger.rs
Original file line number Diff line number Diff line change
Expand Up @@ -32,8 +32,8 @@
//! state consistency.
use frame_support::{
defensive,
traits::{LockableCurrency, WithdrawReasons},
defensive, ensure,
traits::{Defensive, LockableCurrency, WithdrawReasons},
};
use sp_staking::StakingAccount;
use sp_std::prelude::*;
Expand Down Expand Up @@ -106,18 +106,39 @@ impl<T: Config> StakingLedger<T> {
/// This getter can be called with either a controller or stash account, provided that the
/// account is properly wrapped in the respective [`StakingAccount`] variant. This is meant to
/// abstract the concept of controller/stash accounts from the caller.
///
/// Returns [`Error::BadState`] when a bond is in "bad state". A bond is in a bad state when a
/// stash has a controller which is bonding a ledger associated with another stash.
pub(crate) fn get(account: StakingAccount<T::AccountId>) -> Result<StakingLedger<T>, Error<T>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One issue I feel when reading this function, since it handles both reading by controller and stake, it becomes a little harder to reason about which checks we need for each. What do you think about just splitting these flows.

So internally two functions get_by_stash and get_by_controller that this function calls.

Leave that decision to you though. Feel free to merge if you are happy and we probably will do lot more cleanups and refactors once controllers are removed completely.

Copy link
Contributor Author

@gpestana gpestana Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably will do lot more cleanups and refactors once controllers are removed completely.

Yes, I agree with this statement. The idea of this fn was to keep the code refactors to a minimal once the controllers were deprecated. So refactoring the code would basically mean removing the whole StakingAccount machinery and change the logic of the getter (and others) but the API would remain almost unchanged. I will keep this as-is for now and we can remove simplify it once the controller logic is completely removed.

let controller = match account {
StakingAccount::Stash(stash) => <Bonded<T>>::get(stash).ok_or(Error::<T>::NotStash),
StakingAccount::Controller(controller) => Ok(controller),
}?;
let (stash, controller) = match account.clone() {
gpestana marked this conversation as resolved.
Show resolved Hide resolved
StakingAccount::Stash(stash) =>
(stash.clone(), <Bonded<T>>::get(&stash).ok_or(Error::<T>::NotStash)?),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(stash.clone(), <Bonded<T>>::get(&stash).ok_or(Error::<T>::NotStash)?),
(stash.clone(), <Bonded<T>>::get(&stash).defensive_ok_or(Error::<T>::NotStash)?),

Copy link
Contributor Author

@gpestana gpestana Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be defensive? The StakingLedger::get() is also used in extrinsics like validate, nominate, etc so we may be throwing defensives on user calls, which are expected from a runtime perspective in case users calls with wrong accounts, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested it to be defensive because if the caller of this function uses StakingAccount::Stash(_), then Bonded storage item MUST exist, and similarly for StakingAccount::Controller(_).

But I see what you mean; if someone calls staking.validate(alice) and alice is not even a staker, then we will fail. Yeah you are right, defensive would produce too many false negatives here.

Copy link
Contributor Author

@gpestana gpestana Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On top of that, the defensives in testing are a bit too strict to work with. There are cases where we test that validate(reaped_stash) and ensure that we get a NotStash error, for example. And with defensive, we need to handle the panic instead of the returned error and we cannot keep doing assertions once the defensive has been thrown.

There are ways to handle this by breaking down and re-org the tests but I've come across this in other instances and wish that it would be easier to assert a defensive and runtime error at the same time and be able to proceed with the test case. I think this could be improved at the FRAME testing level, I'll think more about it.

StakingAccount::Controller(controller) => (
Ledger::<T>::get(&controller)
.map(|l| l.stash)
.ok_or(Error::<T>::NotController)?,
kianenigma marked this conversation as resolved.
Show resolved Hide resolved
controller,
),
};

<Ledger<T>>::get(&controller)
let ledger = <Ledger<T>>::get(&controller)
.map(|mut ledger| {
ledger.controller = Some(controller.clone());
ledger
})
.ok_or(Error::<T>::NotController)
.ok_or(Error::<T>::NotController)?;

// if ledger bond is in a bad state, return error to prevent applying operations that may
// further spoil the ledger's state. A bond is in bad state when the bonded controller is
// associted with a different ledger (i.e. a ledger with a different stash).
//
// See </~https://github.com/paritytech/polkadot-sdk/issues/3245> for more details.
ensure!(
Bonded::<T>::get(&stash) == Some(controller) && ledger.stash == stash,
Error::<T>::BadState
);

Ok(ledger)
}

/// Returns the reward destination of a staking ledger, stored in [`Payee`].
Expand Down Expand Up @@ -201,6 +222,30 @@ impl<T: Config> StakingLedger<T> {
}
}

/// Sets the ledger controller to its stash.
pub(crate) fn set_controller_to_stash(self) -> Result<(), Error<T>> {
let controller = self.controller.as_ref()
.defensive_proof("Ledger's controller field didn't exist. The controller should have been fetched using StakingLedger.")
.ok_or(Error::<T>::NotController)?;

ensure!(self.stash != *controller, Error::<T>::AlreadyPaired);

// check if the ledger's stash is a controller of another ledger.
if let Some(bonded_ledger) = Ledger::<T>::get(&self.stash) {
// there is a ledger bonded by the stash. In this case, the stash of the bonded ledger
// should be the same as the ledger's stash. Otherwise fail to prevent data
// inconsistencies. See </~https://github.com/paritytech/polkadot-sdk/pull/3639> for more
// details.
ensure!(bonded_ledger.stash == self.stash, Error::<T>::BadState);
}
Ank4n marked this conversation as resolved.
Show resolved Hide resolved

<Ledger<T>>::remove(&controller);
<Ledger<T>>::insert(&self.stash, &self);
<Bonded<T>>::insert(&self.stash, &self.stash);

Ok(())
}

/// Clears all data related to a staking ledger and its bond in both [`Ledger`] and [`Bonded`]
/// storage items and updates the stash staking lock.
pub(crate) fn kill(stash: &T::AccountId) -> Result<(), Error<T>> {
Expand Down
51 changes: 51 additions & 0 deletions substrate/frame/staking/src/mock.rs
Original file line number Diff line number Diff line change
Expand Up @@ -786,6 +786,57 @@ pub(crate) fn bond_controller_stash(controller: AccountId, stash: AccountId) ->
Ok(())
}

pub(crate) fn setup_double_bonded_ledgers() {
assert_ok!(Staking::bond(RuntimeOrigin::signed(1), 10, RewardDestination::Staked));
assert_ok!(Staking::bond(RuntimeOrigin::signed(2), 20, RewardDestination::Staked));
assert_ok!(Staking::bond(RuntimeOrigin::signed(3), 20, RewardDestination::Staked));
// not relevant to the test case, but ensures try-runtime checks pass.
[1, 2, 3]
.iter()
.for_each(|s| Payee::<Test>::insert(s, RewardDestination::Staked));

// we want to test the case where a controller can also be a stash of another ledger.
// for that, we change the controller/stash bonding so that:
// * 2 becomes controller of 1.
// * 3 becomes controller of 2.
// * 4 becomes controller of 3.
let ledger_1 = Ledger::<Test>::get(1).unwrap();
let ledger_2 = Ledger::<Test>::get(2).unwrap();
let ledger_3 = Ledger::<Test>::get(3).unwrap();

// 4 becomes controller of 3.
Bonded::<Test>::mutate(3, |controller| *controller = Some(4));
Ledger::<Test>::insert(4, ledger_3);

// 3 becomes controller of 2.
Bonded::<Test>::mutate(2, |controller| *controller = Some(3));
Ledger::<Test>::insert(3, ledger_2);

// 2 becomes controller of 1
Bonded::<Test>::mutate(1, |controller| *controller = Some(2));
Ledger::<Test>::insert(2, ledger_1);
// 1 is not controller anymore.
Ledger::<Test>::remove(1);

// checks. now we have:
// * 3 ledgers
assert_eq!(Ledger::<Test>::iter().count(), 3);
// * stash 1 has controller 2.
assert_eq!(Bonded::<Test>::get(1), Some(2));
assert_eq!(StakingLedger::<Test>::paired_account(StakingAccount::Stash(1)), Some(2));
assert_eq!(Ledger::<Test>::get(2).unwrap().stash, 1);

// * stash 2 has controller 3.
assert_eq!(Bonded::<Test>::get(2), Some(3));
assert_eq!(StakingLedger::<Test>::paired_account(StakingAccount::Stash(2)), Some(3));
assert_eq!(Ledger::<Test>::get(3).unwrap().stash, 2);

// * stash 3 has controller 4.
assert_eq!(Bonded::<Test>::get(3), Some(4));
assert_eq!(StakingLedger::<Test>::paired_account(StakingAccount::Stash(3)), Some(4));
assert_eq!(Ledger::<Test>::get(4).unwrap().stash, 3);
}

#[macro_export]
macro_rules! assert_session_era {
($session:expr, $era:expr) => {
Expand Down
84 changes: 81 additions & 3 deletions substrate/frame/staking/src/pallet/impls.rs
Original file line number Diff line number Diff line change
Expand Up @@ -165,7 +165,8 @@ impl<T: Config> Pallet<T> {
let controller = Self::bonded(&validator_stash).ok_or_else(|| {
Error::<T>::NotStash.with_weight(T::WeightInfo::payout_stakers_alive_staked(0))
})?;
let ledger = <Ledger<T>>::get(&controller).ok_or(Error::<T>::NotController)?;

let ledger = Self::ledger(StakingAccount::Controller(controller))?;
let page = EraInfo::<T>::get_next_claimable_page(era, &validator_stash, &ledger)
.ok_or_else(|| {
Error::<T>::AlreadyClaimed
Expand Down Expand Up @@ -1728,7 +1729,7 @@ impl<T: Config> StakingInterface for Pallet<T> {
) -> Result<bool, DispatchError> {
let ctrl = Self::bonded(&who).ok_or(Error::<T>::NotStash)?;
Self::withdraw_unbonded(RawOrigin::Signed(ctrl.clone()).into(), num_slashing_spans)
.map(|_| !Ledger::<T>::contains_key(&ctrl))
.map(|_| !StakingLedger::<T>::is_bonded(StakingAccount::Controller(ctrl)))
.map_err(|with_post| with_post.error)
}

Expand Down Expand Up @@ -1836,6 +1837,7 @@ impl<T: Config> Pallet<T> {
"VoterList contains non-staker"
);

Self::check_bonded_consistency()?;
Self::check_payees()?;
Self::check_nominators()?;
Self::check_exposures()?;
Expand All @@ -1844,9 +1846,67 @@ impl<T: Config> Pallet<T> {
Self::check_count()
}

/// Invariants:
/// * A controller should not be associated with more than one ledger.
/// * A bonded (stash, controller) pair should have only one associated ledger. I.e. if the
gpestana marked this conversation as resolved.
Show resolved Hide resolved
/// ledger is bonded by stash, the controller account must not bond a different ledger.
/// * A bonded (stash, controller) pair must have an associated ledger.
/// NOTE: these checks result in warnings only. Once
/// </~https://github.com/paritytech/polkadot-sdk/issues/3245> is resolved, turn warns into check
/// failures.
fn check_bonded_consistency() -> Result<(), TryRuntimeError> {
use sp_std::collections::btree_set::BTreeSet;

let mut count_controller_double = 0;
let mut count_double = 0;
let mut count_none = 0;
// sanity check to ensure that each controller in Bonded storage is associated with only one
// ledger.
let mut controllers = BTreeSet::new();

for (stash, controller) in <Bonded<T>>::iter() {
if !controllers.insert(controller.clone()) {
count_controller_double += 1;
}

match (<Ledger<T>>::get(&stash), <Ledger<T>>::get(&controller)) {
(Some(_), Some(_)) =>
// if stash == controller, it means that the ledger has migrated to
// post-controller. If no migration happened, we expect that the (stash,
// controller) pair has only one associated ledger.
if stash != controller {
count_double += 1;
},
(None, None) => {
count_none += 1;
},
_ => {},
};
}

if count_controller_double != 0 {
log!(
warn,
"a controller is associated with more than one ledger ({} occurrences)",
count_controller_double
);
};

if count_double != 0 {
log!(warn, "single tuple of (stash, controller) pair bonds more than one ledger ({} occurrences)", count_double);
}

if count_none != 0 {
log!(warn, "inconsistent bonded state: (stash, controller) pair missing associated ledger ({} occurrences)", count_none);
}

Ok(())
}

/// Invariants:
/// * A bonded ledger should always have an assigned `Payee`.
/// * The number of entries in `Payee` and of bonded staking ledgers *must* match.
/// * The stash account in the ledger must match that of the bonded acount.
fn check_payees() -> Result<(), TryRuntimeError> {
for (stash, _) in Bonded::<T>::iter() {
ensure!(Payee::<T>::get(&stash).is_some(), "bonded ledger does not have payee set");
Expand All @@ -1861,6 +1921,11 @@ impl<T: Config> Pallet<T> {
Ok(())
}

/// Invariants:
/// * Number of voters in `VoterList` match that of the number of Nominators and Validators in
/// the system (validator is both voter and target).
/// * Number of targets in `TargetList` matches the number of validators in the system.
/// * Current validator count is bounded by the election provider's max winners.
fn check_count() -> Result<(), TryRuntimeError> {
ensure!(
<T as Config>::VoterList::count() ==
Expand All @@ -1879,6 +1944,11 @@ impl<T: Config> Pallet<T> {
Ok(())
}

/// Invariants:
/// * `ledger.controller` is not stored in the storage (but populated at retrieval).
/// * Stake consistency: ledger.total == ledger.active + sum(ledger.unlocking).
/// * The controller keyeing the ledger and the ledger stash matches the state of the `Bonded`
/// storage.
fn check_ledgers() -> Result<(), TryRuntimeError> {
Bonded::<T>::iter()
.map(|(stash, ctrl)| {
Expand All @@ -1896,8 +1966,10 @@ impl<T: Config> Pallet<T> {
Ok(())
}

/// Invariants:
/// * For each era exposed validator, check if the exposure total is sane (exposure.total =
/// exposure.own + exposure.own).
fn check_exposures() -> Result<(), TryRuntimeError> {
// a check per validator to ensure the exposure struct is always sane.
let era = Self::active_era().unwrap().index;
ErasStakers::<T>::iter_prefix_values(era)
.map(|expo| {
Expand All @@ -1915,6 +1987,10 @@ impl<T: Config> Pallet<T> {
.collect::<Result<(), TryRuntimeError>>()
}

/// Invariants:
/// * For each paged era exposed validator, check if the exposure total is sane (exposure.total
/// = exposure.own + exposure.own).
/// * Paged exposures metadata (`ErasStakersOverview`) matches the paged exposures state.
fn check_paged_exposures() -> Result<(), TryRuntimeError> {
use sp_staking::PagedExposureMetadata;
use sp_std::collections::btree_map::BTreeMap;
Expand Down Expand Up @@ -1979,6 +2055,8 @@ impl<T: Config> Pallet<T> {
.collect::<Result<(), TryRuntimeError>>()
}

/// Invariants:
/// * Checks that each nominator has its entire stake correctly distributed.
fn check_nominators() -> Result<(), TryRuntimeError> {
// a check per nominator to ensure their entire stake is correctly distributed. Will only
// kick-in if the nomination was submitted before the current era.
Expand Down
28 changes: 15 additions & 13 deletions substrate/frame/staking/src/pallet/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -791,6 +791,8 @@ pub mod pallet {
SnapshotTargetsSizeExceeded { size: u32 },
/// A new force era mode was set.
ForceEra { mode: Forcing },
/// Report of a controller batch deprecation.
ControllerBatchDeprecated { failures: u32 },
}

#[pallet::error]
Expand Down Expand Up @@ -935,6 +937,11 @@ pub mod pallet {
return Err(Error::<T>::AlreadyBonded.into())
}

// An existing controller cannot become a stash.
if StakingLedger::<T>::is_bonded(StakingAccount::Controller(stash.clone())) {
return Err(Error::<T>::AlreadyPaired.into())
}

// Reject a bond which is considered to be _dust_.
if value < T::Currency::minimum_balance() {
return Err(Error::<T>::InsufficientBond.into())
Expand Down Expand Up @@ -975,7 +982,6 @@ pub mod pallet {
#[pallet::compact] max_additional: BalanceOf<T>,
) -> DispatchResult {
let stash = ensure_signed(origin)?;

let mut ledger = Self::ledger(StakingAccount::Stash(stash.clone()))?;

let stash_balance = T::Currency::free_balance(&stash);
Expand Down Expand Up @@ -1332,8 +1338,6 @@ pub mod pallet {
pub fn set_controller(origin: OriginFor<T>) -> DispatchResult {
let stash = ensure_signed(origin)?;

// The bonded map and ledger are mutated directly as this extrinsic is related to a
// (temporary) passive migration.
Self::ledger(StakingAccount::Stash(stash.clone())).map(|ledger| {
let controller = ledger.controller()
.defensive_proof("Ledger's controller field didn't exist. The controller should have been fetched using StakingLedger.")
Expand All @@ -1343,9 +1347,8 @@ pub mod pallet {
// Stash is already its own controller.
return Err(Error::<T>::AlreadyPaired.into())
}
<Ledger<T>>::remove(controller);
<Bonded<T>>::insert(&stash, &stash);
<Ledger<T>>::insert(&stash, ledger);

let _ = ledger.set_controller_to_stash()?;
Ok(())
})?
}
Expand Down Expand Up @@ -1960,7 +1963,7 @@ pub mod pallet {
};

if ledger.stash != *controller && !payee_deprecated {
Some((controller.clone(), ledger))
Some(ledger)
} else {
None
}
Expand All @@ -1969,13 +1972,12 @@ pub mod pallet {
.collect();

// Update unique pairs.
for (controller, ledger) in filtered_batch_with_ledger {
let stash = ledger.stash.clone();

<Bonded<T>>::insert(&stash, &stash);
<Ledger<T>>::remove(controller);
<Ledger<T>>::insert(stash, ledger);
let mut failures = 0;
for ledger in filtered_batch_with_ledger {
let _ = ledger.clone().set_controller_to_stash().map_err(|_| failures += 1);
}
Self::deposit_event(Event::<T>::ControllerBatchDeprecated { failures });

Ok(Some(T::WeightInfo::deprecate_controller_batch(controllers.len() as u32)).into())
}
}
Expand Down
Loading
Loading