Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support getters for already set members #221

Closed
lazkindness opened this issue Nov 27, 2024 · 15 comments · Fixed by #222
Closed

Support getters for already set members #221

lazkindness opened this issue Nov 27, 2024 · 15 comments · Fixed by #222
Assignees

Comments

@lazkindness
Copy link
Contributor

Hi, Bon is my favorite crate and I'm using it as the backbone for a complex pipeline internal framework.

In essence, my pipeline works like this:

        pipeline
            .pass(the_first_step)
            .pass(another_step)
            .pass(some_other_step)
            .pass(the_last_step)

Each step of the pipeline is (simplified) implemented like this:

pub async fn the_first_step<S>(
    Payload(id): Payload,
    PgConn(pg_conn): PgConn,
    State(builder): State<NewDbRowBuilder<S>>,
) -> Result<NewDbRowBuilder<SetTheFirstStepValue<S>>, crate::Error>
where
    S: new_db_row_builder::State,
    S::TheFirstStepValue: IsUnset,
{
   // some work that connects to api, db, etc, and produces a value

    Ok(builder.the_first_step_value(my_awesome_value_i_fetched))
}

The types of the pipeline enforce that the state produced by a prior pass must be fed into the pass under it, so you can't accidentally order anything wrong. I love using Bon to back this pipeline system and allow me to make each pass specify what the conditions of its state should be.

Where I am struggling is that sometimes I run into a weird, limiting situation.

I have one step in my pipeline that scrapes some information off of a user in my DB, processes it, then sets it in the builder (because this value must go in the row that's being built). Then, in another pass of the pipeline, I must read processed data out and use it to compute another value that also goes on my row. Given the current implementation of Bon (while following the rules and not directly implementing unsafe methods on generated code), I have three solutions:

  1. Take this processed value, clone it and set it into the state, then return a tuple of the state and the cloned value. This is bad to me because the passes of my pipeline become much less generic as they now must take a tuple of the built state and any previously obtained values and drill them through the entire way
  2. Do what I'm doing currently, and maintain a cache that passes can use to set values and retrieve them later. This works (and is fairly generic), but I lose the ability to safely do the guards like IsSet because this cache has to be completely detached from bon
  3. Use builder(field) on these fields I want to cache. This also works, but it has the same problem as before where I lose the ability to do guards on the passes which is what I love so much about bon.

I noticed in 3.1.1 you removed the non-determinism of generated privates. After seeing this, being able to add methods that take in &self and return a reference to a value in the builder state would be incredible and make my pipeline work exactly how I want it to. This long winded context finally leads to my question (the one in the title)...

How fundamentally unsafe is this? Taking a look at the expanded macro from builder everything I can see indicates that adding read only accessor methods that do not move out of the tuple or clone the data would be completely safe, but I figured it would be well worth opening an issue for this because this will be running in production code where a mistake like this will not be acceptable.

Thank you!

A note for the community from the maintainers

Please vote on this issue by adding a 👍 reaction to help the maintainers with prioritizing it. You may add a comment describing your real use case related to this issue for us to better understand the problem domain.

@Veetaha
Copy link
Collaborator

Veetaha commented Nov 27, 2024

Hi, thank you for creating the issue! Accessing the private fields of the builder (except for #[builder(field)]) is definitely wrong, since those internals may change between patch versions. The only unsafe code generated by builder macros at the moment is unwrap_unchecked for required members in the finishing function. So the only way this may break is if you set the required field inside the builder to None by directly accessing its internals. Vending a readonly reference to the internal field is completely fine, however, there isn't an official API to do that in the current version of bon (3.1.1).

But.. I was actually planning to add an ability to generate getters for already set members as the next feature after #[builder(field)]. I imagine it like this. There will be a #[builder(getter)] annotation that will generate a simple by-readonly-reference getter with no magic, that has the name get_{member}. It will be possible to override its visibility, docs and name with the usual attributes #[builder(getter(name = ..., vis = ..., doc { ... })]. The default visibility will be the same as it would be for setters.

In future increments, I plan to provide the ability to generate getters for mutable references, the ability to do AsRef and Deref coercions, and custom conversions similar to #[builder(with)], but it'll be #[builder(getter(with))].

@lazkindness
Copy link
Contributor Author

Fantastic to hear! Thank you. I am fairly familiar with bon's internals, and if you haven't started I'd be happy to take on and implement this.

@Veetaha Veetaha changed the title How fundementally unsafe is accessing bon internals to act as a getter? Support getters for already set members Nov 27, 2024
@Veetaha
Copy link
Collaborator

Veetaha commented Nov 27, 2024

Yeah, I've been procrastinating on this a bit given the Stalker 2 release 😳. I haven't started with this feature yet, so feel free to take a stab at this 👍

@lazkindness
Copy link
Contributor Author

Haha, have fun. Feel free to assign me this. Will get started now.

@lazkindness
Copy link
Contributor Author

At a high level, how do you feel about having the getters return an impl Borrow<T> instead of just &T? The main reason I'm suggesting this is because the ergonomics could be a bit better for Strings and Vecs which would then return &str and &[T] accordingly.

@Veetaha
Copy link
Collaborator

Veetaha commented Nov 28, 2024

Honestly, I have mixed feelings about automatic magic conversions.

If you suggest the getter to return literally impl Borrow<T>, then it will be inconvenient for the caller, because the caller will then need to both specify the generic parameter T explicitly, and invoke borrow() on the returned value.

In general, even if we use Deref instead of Borrow in this case, I'm against complex type projections like <Vec<T> as Deref>::Target, because that would appear in type hints and in docs in this complex form. So I suggest users to always specify the target of the deref coercion in deref(...) attribute unless it's a well known type.

Here are some ideas I have for this API here in general to be more specific:

#[builder(
    // For optional members automatically does `Option::as_ref` (reference
    // is moved under the Option).
    //
    // &T | Option<&T>
    getter,

    // For optional members does no conversion, because `&mut Option<_>` makes
    // sense if the caller wants to reset the `Option` to `None`.
    //
    // &mut T | &mut Option<T>
    getter(mut),

    // Deref is only supported for readonly getters. One probably doesn't
    // need deref for mutable getters, because returning e.g. `&mut String`
    // or `&mut Vec<T>` makes perfect sense.
    //
    // The argument to `deref` specifies the return type of the getter
    // (i.e. the target of the deref coercion). However, it can be omitted
    // (i.e. just `getter(deref)`) for some well-known types such as:
    // - String
    // - Vec<T>
    // - Box<T>
    // - Rc<T>
    // - Arc<T>
    // - Cow<T>
    // - PathBuf
    // - OsString
    // - CString
    // - Utf8PathBuf
    // - ...
    // Just look up the `Deref` impls in `std` here: https://doc.rust-lang.org/stable/std/ops/trait.Deref.html,
    // and in `camino` crate, we may extend this list in the future.
    //
    // &str | Option<&str>
    getter(deref(&str)),

    // Useful for primitive types (e.g. `u32`)
    // Getter by `Copy` -> T | Option<T>
    getter(copy),

    // Getter by `Clone` -> T | Option<T>
    getter(clone),

    // Some other standard additional attributes
    getter(
        name = getter_name,
        vis = "",
        docs {
            /// Docs for getter_name
        }
    )

    // Multiple getters need to have name specified explicitly
    getter(name = getter_name_1),
    getter(name = getter_name_2, clone),

    // Custom readonly getter. Accepts a readonly reference and transforms it.
    // Only `&_` type is accepted (_ is replaced with member's type)
    getter = |value: &_| -> Ty { expr }

    // Custom mutable getter. Accepts a mutable reference and transforms it.
    // Only `&mut _` type is accepted (_ is replaced with member's type)
    getter = |value: &mut _| -> Ty { expr }

    // Give a name to the getter if there are several getters
    getter(name = foo, with = |value: &mut _| -> Ty { expr }),
)]

So users would manually need to write #[builder(getter(deref))] for String or Vec<T>.

Feel free to implement just part of this API that would be good enough for the start.

You may also take a look at the getset crate's design, and its github issues where people beg for more flexibility.

@Veetaha
Copy link
Collaborator

Veetaha commented Nov 28, 2024

Just one more remark. I expect implementing mut getters will be a PITA, because mut is a keyword and can't be easily parsed by darling in getter(mut) attr metadata (dtolnay/syn#1458). So I'd definitely postpone that part.

@lazkindness
Copy link
Contributor Author

Got it. How do you feel about this then?

I'll finish up the PR and polish it and get it to support the following:

  1. Standard #[builder(getter)] with default name
  2. name, docs, and vis since most of the code to do that is already there and quite easy

That could probably go in as an experimental / unstable API like overwritable, because I think that those two would cover most people's use cases. And then the rest of the features can be implemented in afterwards for stability?

@lazkindness
Copy link
Contributor Author

As another aside (this will likely be pretty complex to implement correctly too and would break the API), it would be very cool if we could add another state like IsSetSome for option types that gets set when calling the non-maybe version of an Option's builder. That way, we can automatically return &T from an Option<T> builder from a getter if we know that it was set to Some.

@Veetaha
Copy link
Collaborator

Veetaha commented Nov 28, 2024

Got it. How do you feel about this then?

Makes sense to me.

That could probably go in as an experimental

By "that" do you mean the getter attribute as a whole in its initial implementation proposed by you (i.e. builder(getter) / builder(getter(name, vis, doc))? If so, makes sense, maybe the API will change according to more feedback.

could add another state like IsSetSome

Yeah, I had such thing in mind as well even before releasing v3 where the new typestate design landed. However, I wouldn't like to have that state generated by default, because that would complicate things and also probably increase compile times. I was thinking of adding something like that in the future, and make that enabled with some explicit attribute.

@lazkindness
Copy link
Contributor Author

Sweet, just put up the PR: #222

This supports those mentioned configurations.

@Veetaha
Copy link
Collaborator

Veetaha commented Nov 29, 2024

I have some more ideas.

Receiver Support

We should also probably support generating a getter for the receiver (self) of associated methods. Today any #[builder(...)] attributes on the receiver are explicitly prohibited:

if let Some(attr) = builder_attr_on_receiver {
bail!(
attr,
"#[builder] attributes on the receiver are not supported"
);
}

.. and I guess this is the first precedent when we need to support #[builder(getter)] on self. Luckily this doesn't conflict with the current design and can also be added as a future extension just like other getter configs that I proposed above (#221 (comment)).

The question for this use case is how such getter should be named? Should it be get_self? This looks like it may confuse people thinking as if it returns the builder itself (since self of that method is the builder itself). Maybe we should just always require a getter(name = ...) in that case?

@lazkindness
Copy link
Contributor Author

Tbh in this case, I would probably just do get_{name_of_struct_method_is_on}

But at the same time, I don't actually think get_selfis all that bad.

@lazkindness
Copy link
Contributor Author

Another interesting design, I'm wondering how we should support getter with start_fn or finish_fn. IMO it should be supported, but you have to manually specify a name.

@Veetaha
Copy link
Collaborator

Veetaha commented Nov 30, 2024

finish_fn parameters can't have getters because they are passed to the finishing function, that consumes the builder, and they are never stored in the builder. However, start_fn parameters can be supported perfectly fine, they are stored in the builder.

I don't think explicit name is required for start_fn members. The underlying identifier of the start_fn member can be used as the basis of the get_{name} method name

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants