Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Generic parameters in derive #2811

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

mzabaluev
Copy link
Contributor

@mzabaluev mzabaluev commented Nov 10, 2019

Add ability to pass generic parameters of the impl to the derive macros, greatly increasing the flexibility of the derive attribute.

Rendered

@Centril Centril added the T-lang Relevant to the language team, which will review and decide on the RFC. label Nov 11, 2019
@nox
Copy link
Contributor

nox commented Nov 11, 2019

So with this proposal, Clone can never allow opting out from bounds, right? Given new macros need to opt into this new system.

@mzabaluev
Copy link
Contributor Author

mzabaluev commented Nov 11, 2019

@nox The derive macro for Clone will be updated to allow unbounded (or differently bounded) parameters. The first code example of the RFC details how it would look like in code.

The attribute input is parsed as a DelimTokenTree; make the
proposed syntax with where clauses conform to that.
There may be bounds that apply to types other than the
generic parameters, so it makes sense to approve the where clause
in this proposal.

This calls for a struct to organize all the generic-carrying inputs to
`proc_macro_derive_with_generics`, which was mentioned previously as
a question. Name it DeriveGenerics.

The extended derive syntax is now fully specified. It's now clear
to me that adorning each comma-separated item with its own generic
parameters is possible, but
the where clause either needs ugly enclosing delimiters or can
only come last. As it relates to some parameters, make it so
that the where clause can only be appended to a single item.
@nox
Copy link
Contributor

nox commented Nov 12, 2019

@nox The derive macro for Clone will be updated to allow unbounded (or differently bounded) parameters. The first code example of the RFC details how it would look like in code.

Wait, so you write down the names of the type params for which you don't want a trait bound? That seems counter-intuitive. Let's say you have Foo<T, U>, what happens to the U trait bound if you write #[derive(<T> Clone)]?

@nox
Copy link
Contributor

nox commented Nov 12, 2019

And how do you handle trait bounds on field types without repeating those field types?

@Lythenas
Copy link

TL;DR: I think this is a shortcoming of the derive implementation for Clone (and probably others) and not something we should have syntax for.

If I understand correctly the Problem you're trying to solve is that for

struct X<T> {
  x: PhantomData<T>
}

#[derive(Clone)] generates

impl<T> Clone for X
where
    T: Clone
{
  fn clone(&self) -> Self {
    Self {  x: x.clone() }
  }
}

where it should actually generate:

impl<T> Clone for X
where
    PhantomData<T>: Clone // this is implemented for every T
{
  fn clone(&self) -> Self {
    Self { x: x.clone() }
  }
}

I think this is more a problem of how the derive for Clone (and others) is implemented. It should generate the bound from the field types (PhantomData<T>) instead of the type parameters (T). Because it is actually calling .clone() on the type of the field (PhantomData<T>) and not the generic type parameter (T).

If you were to actually ignore the type parameter when deriving Clone what would you call for a field of type T. You still have to produce a value for every field when cloning. Even if it gets erased by the compiler in the case of PhantomData<T>.

It might be useful to have uniform way to give arguments to derives but I think the information you want to provide to the derive implementation is so varied that the way it's currently done is good (custom attributes).

@mzabaluev
Copy link
Contributor Author

mzabaluev commented Nov 12, 2019

@nox

Let's say you have Foo<T, U>, what happens to the U trait bound if you write #[derive(<T> Clone)]?

That won't work; you need to list all of the parameters of the impl item:

#[derive(<T, U: Clone> Clone)]
struct Foo<T, U> {
    // ...
}

The idea is that the derive macro can paste these parameters verbatim after the impl keyword and expect the generated impl to work with the bounds given there. This may be more repetitive than field-specific directive attributes (though purely character for character, this way could still be more concise), but it has a "worse is better" virtue: it does not require from the code authors and readers to apply rules in their heads to figure out what parameter bounds will the generated impl item get.

And how do you handle trait bounds on field types without repeating those field types?

The answer is that you apply bounds to non-parameter types only if you have to (in the where clause, which is now officially part of the proposal). I don't expect this to be needed often. First off, a trait impl for a public type should never leak types of private fields which are not directly parameters. Second, with a typical recursively implemented, derivable trait, all fields must have an impl of this trait for their type, so you only need to fit the bounds that satisfy those field impls, it's not imperative to work the field types themselves into the outer impl's bounds.

@mzabaluev
Copy link
Contributor Author

@Lythenas

I think this is more a problem of how the derive for Clone (and others) is implemented. It should generate the bound from the field types (PhantomData<T>) instead of the type parameters (T). Because it is actually calling .clone() on the type of the field (PhantomData<T>) and not the generic type parameter (T).

I think you are conflating the field type with the generic bound that needs to be satisfied so that .clone() works on that field. derive should not automatically expose field types in signatures of generated impls, certainly not for private fields. But derivation of the minimal necessary bounds that satisfy the field's impl is not something that a proc macro can do in current Rust, it should be left to the macro user to define. After all, the whole feature is meant to reduce boilerplate, but assignment of impl bounds may sometimes be a design decision rather than a completely mechanical task.

Add another magic helper attribute to the state machine future,
to specify failure handling that needs to happen after the first step.
@nox
Copy link
Contributor

nox commented Nov 13, 2019

The answer is that you apply bounds to non-parameter types only if you have to (in the where clause, which is now officially part of the proposal). I don't expect this to be needed often. First off, a trait impl for a public type should never leak types of private fields which are not directly parameters. Second, with a typical recursively implemented, derivable trait, all fields must have an impl of this trait for their type, so you only need to fit the bounds that satisfy those field impls, it's not imperative to work the field types themselves into the outer impl's bounds.

There are many types in Stylo, Servo's styling system, where we can't just have general T: Foo bounds on type params and must put those bounds on field types instead. Repeating the field types and the type parameters for each derive just does not scale.

@mzabaluev
Copy link
Contributor Author

There are many types in Stylo, Servo's styling system, where we can't just have general T: Foo bounds on type params and must put those bounds on field types instead.

I'm interested to look at the examples and the design considerations that led to a proliferation of non-uniform impls for derivable traits.

@nox
Copy link
Contributor

nox commented Nov 13, 2019

I'm interested to look at the examples and the design considerations that led to a proliferation of non-uniform impls for derivable traits.

We have generic types that implement a given trait (let's say ToCss) only for some combination of type parameters, we then reuse one of those types (let's say Position<T>) in another generic structure for which we want to derive ToCss. If the derive system generates a T: ToCss bound, the derived code won't compile.

@mzabaluev
Copy link
Contributor Author

We have generic types that implement a given trait (let's say ToCss) only for some combination of type parameters

That sounds like a major inconvenience for developing any kind of regular trait coverage. But yes, for this kind of design the bounds would need to be verbose.

@nox
Copy link
Contributor

nox commented Nov 13, 2019

That sounds like a major inconvenience for developing any kind of regular trait coverage.

Things are this way because that's the most natural way for them to be, and all of our derive stuff has something similar to derive_field_bounds etc like in the other RFC.

@JelteF
Copy link

JelteF commented Nov 13, 2019

I think this is more a problem of how the derive for Clone (and others) is implemented. It should generate the bound from the field types (PhantomData) instead of the type parameters (T). Because it is actually calling .clone() on the type of the field (PhantomData) and not the generic type parameter (T).

I think this is the only correct and most elegant solution to this problem, not having to enter bounds in the derive itself. This solution was actually implemented in my derive_more crate for the Display derive there: JelteF/derive_more#95 (see issue for more discussion: JelteF/derive_more#93)

I think it might actually be possible to implement this without breaking backwards compatibility for the current derives, because it should only fix derives that are generating invalid code now (I might be missing some cases though, so don't take my word on this).

(As you can see in the issue, the Display derive actually needed other bounds to be specified as well. But that was because the derive can be configured to call any method on the fields, which is not the case for the default derives from std)

@nox
Copy link
Contributor

nox commented Nov 13, 2019

@JelteF If the field types are private and the type for which we derive stuff is public, last I checked it will not pass privacy checks.

@JelteF
Copy link

JelteF commented Nov 13, 2019

@nox changing the outer type to pub in the tests in my library I indeed get the following warning, but no error:

warning: private type `generic::complex_type_field_enumerator::Struct<T>` in public interface (error E0446)
   --> tests/display.rs:283:22
    |
283 |             #[derive(Display)]
    |                      ^^^^^^^
    |
    = note: `#[warn(private_in_public)]` on by default
    = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
    = note: for more information, see issue #34537 </~https://github.com/rust-lang/rust/issues/34537>

Maybe this should be a reason to reconsider phasing this behaviour out (or modifying it so that this type of case is allowed). It seems to me that this should be the way you would want to implement an implementation like this. As far as I can tell there's no reason for this type of trait implementation to be disallowed.

If we wanted #[derive(Clone)] to actually do this we could ignore this warning on the derive (I'll probably do this for derive Display).

@JelteF
Copy link

JelteF commented Nov 13, 2019

Actually, in the tracking issue for that lint there is the following comment that basically describes this exact issue:
rust-lang/rust#34537 (comment)
It also says that that code should be accepted once rust-lang/rust#48054 is implemented, so it sounds like this code should (eventually) pass the new privacy rules too.

@mzabaluev
Copy link
Contributor Author

Private-in-public is only part of why leaking field types in bounds is problematic. Even if the type of a private field is public (as e.g. PhantomData is), leaking it in an impl bound for the containing structure breaks encapsulation, as you can no longer freely change that field's type without affecting the public signature of the impl.

@JelteF
Copy link

JelteF commented Nov 13, 2019

Do you mean that someone can implement Clone for PhantomData. Without implementing clone for TheirType?
And that changing it the field from PhantomData to T would result in API breakage.

@mzabaluev
Copy link
Contributor Author

mzabaluev commented Nov 14, 2019

We have generic types that implement a given trait (let's say ToCss) only for some combination of type parameters

I see only six instances in this list where non-parameter bounds were needed.

Let's pick a simple one:

impl<H, V, NonNegativeLengthPercentage> ToCss for Circle<H, V, NonNegativeLengthPercentage> where
    GenericPosition<H, V>: ToCss,
    NonNegativeLengthPercentage: ToCss + PartialEq,

Why can't there be a generic impl<H: ToCss, V: ToCss> ToCss for GenericPosition<H, V>? Where are the specific impls defined?

@mzabaluev
Copy link
Contributor Author

mzabaluev commented Nov 14, 2019

@JelteF Clone is implemented for PhantomData regardless of the parameter type.

What I mean is that changing this:

use some_public_api::Bar;

pub struct Foo<T> {
    inner: Bar<T>
}

impl<T> Clone for Foo<T> where Bar<T>: Clone {
    // ...
}

to this:

use some_public_api::Baz;

pub struct Foo<T> {
    inner: Baz<T>
}

impl<T> Clone for Foo<T> where Baz<T>: Clone {
    // ...
}

is a breaking change for users of Foo, and bounds like that are best avoided because they expose an implementation detail of Foo.

@mzabaluev
Copy link
Contributor Author

mzabaluev commented Nov 14, 2019

I see only six instances in this list where non-parameter bounds were needed.

As further analyzed in servo/servo#24733, of these only two have a reason necessity, albeit questionable, for having a non-parameter bound, and a third is caused by those two.

@mzabaluev
Copy link
Contributor Author

A legitimate reason for having bounds on public field types might be maintainability: like in the Servo example, you are certain, due to the design constraints on the subject functionality, that a certain struct has to have a field of a certain public type, so it makes sense to define the bounds on the struct's impl through those of the field.

This looks like a design decision to me, and as such, it deserves some more explicit syntax.

@JelteF
Copy link

JelteF commented Nov 14, 2019 via email

@mzabaluev
Copy link
Contributor Author

@nox

and all of our derive stuff has something similar to derive_field_bounds etc like in the other RFC.

It should be noted that any custom helper attributes to further reduce boilerplate, or even a systematic solution like #2353, are not incompatible with this proposal. It only makes them needed in fewer cases.

@mzabaluev
Copy link
Contributor Author

mzabaluev commented Nov 14, 2019

My what I meant with my question was: in what case does it actually break when these bounds change?

Consider a type in another crate that embeds Foo<T> and implements Clone. It has to specify its own bounds for Clone so that the bound on impl<T> Clone for Foo<T> ... is satisfied. Before the change, those bounds have to include Bar<T>: Clone, but nothing required that impl to satisfy Baz<T>: Clone, so after the change, it no longer compiles. Now, the authors of that other crate could have done the footwork and deconstructed the bounds on Bar<T>: Clone, as per their current state in their dependency graph, but that just moves the fragility down to them because they are now implicitly dependent on those deconstructed bounds staying compatible across your changes.

It sounds like on one hand you want the
minimal viable bounds, but on the other you hand you don't.

I normally want minimal necessary bounds that don't expose the internal details of a type.

@JelteF
Copy link

JelteF commented Nov 14, 2019

Consider a type in another crate that embeds Foo<T> and implements Clone. It has to specify its own bounds for Clone so that the bound on impl<T> Clone for Foo<T> ... is satisfied. Before the change, those bounds have to include Bar<T>: Clone, but nothing required that impl to satisfy Baz<T>: Clone, so after the change, it no longer compiles.

Just to be clear, you mean something like this: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=f13f8658e57d192d3c4e74a4514d3c61

My question is, what bounds on the Clone implementation would you want in this case then? Simply where T: Clone? Because that would be the current behaviour.

@mzabaluev
Copy link
Contributor Author

Just to be clear, you mean something like this: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=f13f8658e57d192d3c4e74a4514d3c61

I'd prefer a less contrived example because there is no good reason for MyType not to implement Clone here.

@JelteF
Copy link

JelteF commented Nov 14, 2019

I'd prefer a less contrived example because there is no good reason for MyType not to implement Clone here.

This is just your example written from your comment written out as far as I can tell. So if you think it's too contrived (I agree it is), that's because your original example already was. I cannot think of a non contrived example, that's why I was asking you for it.

@mzabaluev
Copy link
Contributor Author

This is just your example written from your comment written out as far as I can tell. So if you think it's too contrived (I agree it is), that's because your original example already was.

My example did not have this irregular trait impl:

impl Clone for Bar<MyType> {
    fn clone(&self) -> Self {
        Self {
            data: MyType{}
        }
    }
}

Sure this forces a bound on Bar<T>: Clone to be able to cover MyType, but is there a non-contrived reason why MyType should not implement Clone itself and therefore be covered by the derived impl<T: Clone> Clone for Bar<T>?

@JelteF
Copy link

JelteF commented Nov 14, 2019 via email

@mzabaluev
Copy link
Contributor Author

@JelteF As far as I understand, your addition is meant to justify a bound on a private field type. Unfortunately, it does not go far enough in showing what real purpose would this design be useful for, because you can just derive Clone for MyType instead: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=a7c2026817494c6376c47b435cac0650

@jonas-schievink jonas-schievink added the A-derive Deriving related proposals & ideas label Mar 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-derive Deriving related proposals & ideas T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants