Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BTreeMap and potentially other FromIter codec does not have strict one-to-one correspondance #251

Open
sorpaas opened this issue Feb 16, 2021 · 2 comments

Comments

@sorpaas
Copy link
Member

sorpaas commented Feb 16, 2021

I'm not sure whether this would be an issue for SCALE, but for SSZ and RLP, strict one-to-one correspondence is usually required.

Current FromIter codecs (like the implementation for BTreeMap) does not enforce ordering. This means that multiple encodings are representing the same value. This will make code like below unsafe (consider if the definition of Extrinsic has an item of type BTreeMap):

let extrinsic = Extrinsic::decode(&input);
// do something.
validate_extrinsic_against_signature(blake2(extrinsic.encode()), signature);

I'm not sure we need to fix this, and if we need to fix this, how do we deal with backward compatibility (we need to make sure no one ever encountered a many-to-one in a production blockchain outside of runtime). If we decide to leave this behavior as it is, I suggest we put some sort of warnings in the implementations of those FromIter codecs, indicating that code snippets like above may be unsafe.

@bkchr
Copy link
Member

bkchr commented Feb 16, 2021

Some time ago we had a discussion here #77

I see the point, but the encoding of all these types does not enforce an order or similar.

@sorpaas
Copy link
Member Author

sorpaas commented Feb 16, 2021

I see the point, but the encoding of all these types does not enforce an order or similar.

This is indeed what puzzled me at first when I first realized the issue. The thing is, for other similar codec schemas like SSZ and RLP, one-to-one is usually required. So I do want us to have a good conclusion whether all those types which currently does not enforce ordering is okay or not.

The above code snippets I provided (decode a struct and then encode it again) is indeed a real use case that I used many times in the past (for ssz encoding at least). If someone ever did it, that is indeed an "attack vector", so right now it should be avoided. I don't recall using it in Substrate codebase though.

As I remember, many-to-one was one of the reasons why people didn't use protobuf or plain json for blockchain encoding, besides performance, and it was one of the reasons that led to creation of those new blockchain encoding schemas.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants