Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merging work from yatima-inc/sp-multihash #144

Closed
johnchandlerburnham opened this issue Oct 14, 2021 · 18 comments
Closed

Merging work from yatima-inc/sp-multihash #144

johnchandlerburnham opened this issue Oct 14, 2021 · 18 comments

Comments

@johnchandlerburnham
Copy link

Hi, I talked briefly to @Stebalien about merging some of the work from our /~https://github.com/yatima-inc/sp-multihash no_std fork. Happy to send PRs, but wanted to ask a few questions first:

  1. Const-Generics:sp-multihash is on rust 1.54-nightly, so we were able to adapt @mriise's excellent const generics #116. It looks like you guys are waiting on 🔬 Tracking issue for generic associated types (GAT) rust-lang/rust#44265 hitting stable Rust first before merging const generics #116. To me, the usability regression reported in this comment seems acceptable compared to the usability gains from not having to drag typenums around everywhere. Plus, const-generics also opens up some really exciting refactoring possiblities, like @Stebalien's neat idea for using DSTs to erase the size-info here: Unsized Cids argumentcomputer/sp-cid#5. Since waiting some weeks/months for stable GATs seems not ideal, I'm wondering if we can figure out some way around this:
  • Is the stable requirement an absolute for this lib? Or maybe we could make a "nightly" rust-multihash branch/release so that development here isn't impacted by rustc as much?
  • Alternatively, is there a way to make the above usability regression acceptable to you, using only stable const-generics?
  1. ByteCursor: Our no_std work removes std::io functions for reading/writing in favor of our bytecursor library. This can be added independently of Const-Generics.
  • Is the bytecursor dependency acceptable here? What about sp-std?
  • Do you want to keep the std::io, functions gated behind the std feature via pragmas? If so, should std still be default?
  1. Nix: We've added a Nix build to sp-multihash, and can PR (independently of the above) to add one here (preview: /~https://github.com/yatima-inc/rust-multihash/pull/1/files), if that's of interest.
@mriise
Copy link
Contributor

mriise commented Oct 14, 2021

A small (ish) note for part 1:
since const generic expressions are still incomplete its still unsure exactly how the implementation would work as it still changes somewhat often. GATs are closer to stabilization though as there has been a recent push for GATs.
I still play with things every week or so in hopes of finding a good fit this is the current best, but it still has unanswered questions as to how generic digests interact with the multihash table.

@Stebalien
Copy link
Member

(writing my opinions on this here so they're public)

  1. I'd obviously prefer to just adopt it, but I have little experience using these libraries in practice. But whatever we do, supporting stable rust is a requirement, IMO.
  2. Given that sp-std is basically just a core/std switch, I see no reason not to depend on it.
  3. In terms of ByteCursor, that looks like a better alternative to WIP: allow read/write without std #138, but the standard io traits should still be supported when the "std" feature is enabled. Would it be possible to make ByteCursor implement the standard traits iff the "std" feature is enabled? That would mean we'd only need one function and an if_cfg.

@mriise
Copy link
Contributor

mriise commented Oct 14, 2021

#[cfg(feature = "std")] and #[cfg(not(feature = "std"))] can be placed pretty much anywhere if I understood what 3 was asking

@Stebalien
Copy link
Member

Yep, that's what I was thinking.

@johnchandlerburnham
Copy link
Author

Okay, it's sounding like we should focus on sending PRs for ByteCursor and Nix, and waiting on const-generics. We'll get to work on PRs for those two.

Given that sp-std is basically just a core/std switch, I see no reason not to depend on it.

To clarify the double negative, you mean we should depend on sp-std or we should not, and use core, alloc, etc directly instead?

Would it be possible to make ByteCursor implement the standard traits iff the "std" feature is enabled?

Doing the std::io traits in bytecursor itself is a good idea, I'll look into that.

@Stebalien
Copy link
Member

To clarify the double negative, you mean we should depend on sp-std or we should not, and use core, alloc, etc directly instead?

I mean we can depend on it. I'd slightly prefer to depend on core/alloc directly, but I'm not familiar enough with the ergonomics of either to make a call.

My only real concern here is not forcing downstream crates from using sp_std. But far as I can tell, they can choose to depend on std/core/alloc directly and simply ignore the fact that this crate uses sp, right?

@vmx
Copy link
Member

vmx commented Oct 15, 2021

@johnchandlerburnham first of all. Thanks for opening this issue. I'm currently the maintainer of the library, though sadly spent too little time on it. Last year I've spent a lot of time consolidating existing (forked) rust-multihash implementations into a single one. The reason is, that for me this is the core of open source. Trying to collaborate where possible and building things together.

This means I'd like to keep forking to a minimum where possible. So I'm happy to work together on this. Luckily others already jumped in on this issue, so i sounds like we already have a good plan (thanks @mriise and @Stebalien).

I don't know if it's possible, but perhaps we can port as many things as possible to rust-multihash (which will depend in Rust stable) and sp-multihash is an API compatible (where possible) drop-in replacement that is using const generics and need Rust nightly.

  • Alternatively, is there a way to make the above usability regression acceptable to you, using only stable const-generics?

This depends on the upstream users. When I merged the available forks and the API changed a lot, I tried hard to get agreement from the biggest upstream users I know of (back then /~https://github.com/libp2p/rust-libp2p/ and /~https://github.com/ChainSafe/forest/) that they will be OK to upgrade to that version, so that I don't end up with new forks.

@Stebalien
Copy link
Member

Looking at both forest and rust libp2p, both of them seem to be using fixed sized CIDs anyways.

@Stebalien
Copy link
Member

@johnchandlerburnham have you considered making ByteCursor generic over AsRef<[u8]>? The fact that one needs to allocate a vector to make a byte buffer is a pretty big drawback when trying to avoid allocations.

@mriise
Copy link
Contributor

mriise commented Oct 16, 2021

I think it might be good to use an implementation already made for using read/write in no_std enviornments

this one seems goood and has const generic support

this one is is auto generated

@Stebalien
Copy link
Member

I wonder what other projects are using. Neither of those crates seems to have that many users (although core2 looks like it might have a future), and they're both pretty "heavy" (likely high maintenance).

I'm also seeing https://crates.io/crates/byteio as a more general bytecursor, but it's still not as general as the core io traits.

@samuelburnham
Copy link
Contributor

samuelburnham commented Oct 27, 2021

Hi all, I'm the co-author of the sp-multihash fork. I agree with the above concerns over adding the bytecursor dependency, since the ideal solution would be a drop-in replacement that implements no-std Read and Write traits and requires minimal refactoring.

The core2 crate that @mriise mentioned seems to do this pretty well and is used in projects such as rust-bitcoin, so I used it to make a much simpler no-std I/O fork here, which I'd love to get feedback on and have opened #146 if it's helpful.

@mriise
Copy link
Contributor

mriise commented Oct 27, 2021

@johnchandlerburnham @samuelburnham just curious since I haven't actually needed to use Write for anything, what is the the driving force behind this effort?
If you have a snipet of code of the ideal API, I think that would help me better understand what can be done to improve the situation.

@Stebalien
Copy link
Member

Read/Write traits allow serializing/deserializing to/from a stream (or byte buffer) without additional allocations. That's the main point, at least.

@Stebalien
Copy link
Member

@samuelburnham that approach looks nice and clean. In the future, we could look into depending directly on core2 (it provides an std shim) but making it optional is probably the better choice for now.

@samuelburnham
Copy link
Contributor

@mriise the main use case for Write in Yatima is to serialize language terms and definitions via DAG-CBOR or other codec, and then storing this data on IPFS for reproducible programming. The encoding process in libipld uses Write, so it would be nice to be able to use that when we generate Cids for definitions, as seen here:

/// A map of content ids to defs, with content ids for the def
#[derive(PartialEq, Clone, Debug)]
pub struct Defs {
  pub defs: BTreeMap<Cid, Def>,
  pub names: BTreeMap<Name, Cid>,
}

impl Def {
  /// Creates a def and a corresponding package entry
  pub fn make(pos: Pos, typ_: Term, term: Term) -> (Self, Entry) {
    let (type_anon, type_meta) = typ_.embed();
    let (term_anon, term_meta) = term.embed();
    let ast_cid = term_anon.cid();
    let defn = Entry {
      pos,
      type_anon: type_anon.cid(),
      type_meta,
      term_anon: ast_cid,
      term_meta,
    };
    let def = Def { pos, def_cid: defn.cid(), ast_cid, typ_, term };
    (def, defn)
  }

This all has to be done in no-std because the Yatima interpreter compiles to WASM for usage in the blockchain, embedded, etc.

@samuelburnham
Copy link
Contributor

@Stebalien thanks! I was running into some error conversion issues with depending on core2 directly, but I agree probably better as a next step effort.

@vmx
Copy link
Member

vmx commented Feb 2, 2022

@samuelburnham I think with the 0.16 release we have incorporated everything from your work on sp-multihash. If something is missing, please open an issue or re-open this one.

@vmx vmx closed this as completed Feb 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants