Where clauses for more expressive bounds #135
The bounds syntax was definitely an issue when working with things like encoders and decoders, it makes things pretty hard to read. The fact that this also allows multiple dispatch is pretty awesome, so +1. The associated type syntax for specifying bounds is a little bit weird, but it makes sense the first time looking at it. I also definitely prefer the Are associated types going to have a separate RFC? |
It seems it wouldn't be that bad to just use the same syntax to both declare type parameters and bounds on arbitrary types. That is:
For the formatting issue one could write it like this:
Which also allows to reuse types for multiple declarations, like this (this has been suggested in the past):
Instead of "template", one could use "where" or perhaps "type", or even bare angle brackets with no keyword at all, like this:
If nesting is allowed, then there is a slight ambiguity, since something like this:
Could be interpreted as either declaring an additional K parameter, or adding an Eq bound to K. However, hiding an existing parameter doesn't seem very useful, so interpreting this as adding a bound to K should solve the issue. |
@bill-myers that was #122 |
👍 I like this a lot. |
In such cases, the bound will be checked in the callee, not the
caller, and is not included in the list of things that the caller must
prove. Therefore, given the following two functions, an error
is reported in `foo()`, not `bar()`:
Why should this even be allowed? Are there any benefits to testing the bounds in the callee, vs just reporting an error that the where
clause does not refer to any type parameters?
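For reference, a minimal sketch of a where clause that refers to no type parameters, written in present-day Rust rather than the 2014 syntax of this thread. It is not the RFC's elided example, and `answer` is a hypothetical name. As far as I know, current stable compilers accept such a bound when it holds and report an error at the declaration when it does not, which is closer to "checked in the callee" than to "checked by each caller".

```rust
// A where clause that mentions no type parameters at all.
// `i32: Copy` always holds, so the bound is trivially satisfied
// and callers have nothing to prove.
fn answer() -> i32
where
    i32: Copy,
{
    42
}
```

Calling `answer()` needs no type arguments or extra proof at the call site.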
On Tue, Jun 24, 2014 at 09:19:39AM -0700, bill-myers wrote:
It is to some extent, yes. I suppose the only thing it rules out are
Speaking personally, I've always found the C++ style confusing because |
Wow that's a tricky multi-dispatch pattern. The original, "more complicated multidispatch proposal" looks a lot cuter without the duplication of the input types, and more generally with putting all the input parameters in a tuple and having only the output parameters show up in the (No comments on the actual |
Yeah, multi-dispatch is complicated, but I'm glad it's possible. And this would presumably open the door to making it simpler to define multi-dispatch methods later (even if it desugars into this complicated version). |
fn reduce<T:Clone>(xs: &[T]) -> T
    where () : Add<T,T,T>
    //    ^~~~~~~~~~~~~~~ Note: DRY
{
This is a bit unimportant, but I don't see the advantage with this idiom - is the only benefit that you don't have to write T,T
on the lhs here? It seems like a lot of work to go to for that and the result seems less clear to me. Does it facilitate the next step in some way?
+1 For the proposal in general, looks good.
AIUI using ()
like this is just for DRY, because you only have to write the impl on ()
once, but every time you need to bound on Add
you have to write it out, so saying where (L,R) : Add<L,R,S>
is repetitious.
Looking at this a bit more, I'm confused. The trait coherence rules documented above say that you can't implement the same trait twice on a given type, even if the type parameters to the trait differ. But here you're defining multiple Add
implementations on the single type ()
. Isn't that a violation of the same coherence rule that prevented Add<Complex, Complex>
+ Add<int, Complex>
on Complex
?
And if you intended to relax the coherence rules to allow this, then what's the point on defining the trait on (L,R)
to begin with? You could just implement it directly on ()
.
The only difference in the impl on ()
seems to be the fact that it's implemented in terms of type parameters, but I don't see why that should make a difference. And if it is somehow special, then what is the type of the expression <() as Add>::add
?
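For readers coming from present-day Rust: the coherence restriction discussed here reflects the 2014 rules. In current Rust the trait's type parameters are taken into account, so several `Add` impls with different right-hand sides can coexist on one type. A minimal sketch, where the `Complex` struct and its fields are assumptions made up for illustration:

```rust
use std::ops::Add;

// Hypothetical complex-number type with integer components.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Complex {
    re: i32,
    im: i32,
}

// Complex + Complex
impl Add<Complex> for Complex {
    type Output = Complex;
    fn add(self, rhs: Complex) -> Complex {
        Complex { re: self.re + rhs.re, im: self.im + rhs.im }
    }
}

// Complex + i32: a second impl of the same trait on the same type,
// differing only in the trait's type parameter.
impl Add<i32> for Complex {
    type Output = Complex;
    fn add(self, rhs: i32) -> Complex {
        Complex { re: self.re + rhs, im: self.im }
    }
}
```

Both `c + c` and `c + 3` then resolve to the matching impl based on the right-hand operand type.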
But there's only one impl for ()
, it's just really generic and only does the dispatching to the regular impls that are on various different tuple types and thus coherent.
<() as Add>::add
probably still needs to infer that it's really <() as Add<T, T, T>>::add
(the really-generic impl on ()
), which then calls <(T, T) as Add<T, T, T>>::add
(one of the explicit impls like impl Add<int,int,int> for (int,int) { ... }
). I think it's really just to avoid spelling out <(T, T) as Add>
, an aid to point inference in the right direction.
This scheme of leaving the dispatch to a helper function/impl also seems to lock out gimmicky impls like Add<T, U, V> for (A, B)
with A, B != T, U
since the ()
impl basically doublechecks that.
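The dispatch chain described above can be sketched in present-day Rust. This is an illustration under assumptions, not the RFC's code: the trait is named `Add3` here (a hypothetical name, chosen to avoid confusion with `std::ops::Add`), `int` becomes `i32`, and the trait is named in full in each qualified path.

```rust
// Hypothetical three-parameter trait: left type, right type, result type.
trait Add3<L, R, S> {
    fn add(left: &L, right: &R) -> S;
}

// One explicit impl, on a tuple type. Each tuple type has its own impl,
// so these impls never overlap.
impl Add3<i32, i32, i32> for (i32, i32) {
    fn add(left: &i32, right: &i32) -> i32 {
        *left + *right
    }
}

// The single, fully generic impl on `()`. It only forwards to whichever
// tuple impl exists for (L, R).
impl<L, R, S> Add3<L, R, S> for ()
where
    (L, R): Add3<L, R, S>,
{
    fn add(left: &L, right: &R) -> S {
        <(L, R) as Add3<L, R, S>>::add(left, right)
    }
}

// The caller only writes the `()` bound; the tuple type is never spelled out.
fn reduce<T: Clone>(xs: &[T]) -> T
where
    (): Add3<T, T, T>,
{
    let mut accum = xs[0].clone();
    for x in &xs[1..] {
        accum = <() as Add3<T, T, T>>::add(&accum, x);
    }
    accum
}
```

Because the `()` impl requires `(L, R): Add3<L, R, S>`, an impl such as `Add3<T, U, V> for (A, B)` with mismatched types is never reachable through it, which is the double-check mentioned above.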
Oh, so the trait coherence rules actually only care about the number of impl
blocks, not the number of concrete implemented traits? Interesting. And it looks like you're right, I can verify that in a quick test. Still, seems pretty odd. And I still want to know how <() as Add>::add
is supposed to be typed, because <() as Add<T, T, T>>::add
is not a type, it's an expression. I guess it would just be considered an unconstrained type though, preventing it from being inferred properly.
I thought about suggesting the syntax <() as Add<L,R,S>>::add
, which would need to be a modification to the UFCS proposal (I glanced over it, but didn't see that in there, all the examples left the trait unparameterized, as in <() as Add>
). I'm not sure if that's particularly useful though.
In any case, given this rule for trait coherence, it seems like we can skip the tuple stuff altogether. Instead do something like
// in mod ops
mod impl {
    pub trait Add<LHS,RHS,Result> {
        fn add(lhs: &LHS, rhs: &RHS) -> Result;
    }
}
pub trait Add<RHS,Result> {
    fn add(&self, rhs: &RHS) -> Result;
}
impl<LHS,RHS,Result> Add<RHS,Result> for LHS
    where (LHS,RHS) : impl::Add<LHS,RHS,Result>
{
    fn add(&self, rhs: &RHS) -> Result {
        <(Self,RHS) as impl::Add>::add(self, rhs)
    }
}
With this, you can now just say
use AddImpl = core::ops::impl::Add;
impl AddImpl<Complex, int, Complex> for (Complex, int) { ... }
impl AddImpl<Complex, Complex, Complex> for (Complex, Complex) { ... }
impl AddImpl<int, Complex, Complex> for (int, Complex) { ... }
This looks similar to the Add
impls in the RFC, but by using split traits like this, we can skip the ()
nonsense. Usage would now just look like
fn reduce<T:Clone + Add<T,T>>(xs: &[T]) -> T {
    let mut accum = xs[0].clone();
    for x in xs.slice_from(1) {
        accum = accum.add(&x);
    }
    accum
}
This also skips the need for a global function. And in fact all existing code that references Add<RHS,Result>
bounds will continue to work. The only change is the implementation of the trait.
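A sketch of the split-trait idea in present-day Rust, under a few assumptions: the inner module is called `imp` because `impl` is a keyword and cannot name a module, the user-facing trait is called `MyAdd` (hypothetical) to stay clear of `std::ops::Add`, and `Complex` is a made-up type with `i32` fields.

```rust
mod imp {
    // Implementation trait, implemented on (LHS, RHS) tuples.
    pub trait Add<L, R, S> {
        fn add(lhs: &L, rhs: &R) -> S;
    }
}

// User-facing trait: this is what appears in bounds.
trait MyAdd<R, S> {
    fn add(&self, rhs: &R) -> S;
}

// One blanket impl forwards every `L: MyAdd<R, S>` to the tuple impl.
impl<L, R, S> MyAdd<R, S> for L
where
    (L, R): imp::Add<L, R, S>,
{
    fn add(&self, rhs: &R) -> S {
        <(L, R) as imp::Add<L, R, S>>::add(self, rhs)
    }
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct Complex {
    re: i32,
    im: i32,
}

impl imp::Add<Complex, i32, Complex> for (Complex, i32) {
    fn add(lhs: &Complex, rhs: &i32) -> Complex {
        Complex { re: lhs.re + *rhs, im: lhs.im }
    }
}

impl imp::Add<Complex, Complex, Complex> for (Complex, Complex) {
    fn add(lhs: &Complex, rhs: &Complex) -> Complex {
        Complex { re: lhs.re + rhs.re, im: lhs.im + rhs.im }
    }
}

// Usage site: an ordinary two-parameter bound, no `()` and no tuples.
fn reduce<T: Clone + MyAdd<T, T>>(xs: &[T]) -> T {
    let mut accum = xs[0].clone();
    for x in &xs[1..] {
        accum = accum.add(x);
    }
    accum
}
```

The tuple impls stay an implementation detail; code that only bounds on `MyAdd<R, S>` never sees them.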
Ah right, the tuple being a reference itself is not right. But yeah, using (&Complex, &int)
would work.
As for referring to the left- or right-hand side individually, I suppose if you actually need to do that, then use the Add<L,R,S>
pattern. But I don't see the benefit to using that for traits that don't need to refer to the LHS or RHS in the trait definition (like Add
). With the approach I outlined, Add<S>
vs Add<L,R,S>
is just an implementation detail, and users always say Add<R>
when using it as a type bound.
With where
clauses, I'm tempted to try to construct some sort of ad-hoc type destructuring device like this:
mod detail {
pub trait IsSameType {}
impl<T> IsSameType for (T, T) {}
}
pub trait Add<Result> {
fn add<T, U>(&T, &U) -> Result
where ((T, U), Self) : detail::IsSameType;
}
but I'm not sure whether it's actually possible to implement that method anymore. ;)
If where
clauses are extended to support basic ==
comparison of types (which I think will be necessary if we get associated types), then you can do destructuring like
impl<T,L,R> Foo<T>
where T == (L,R)
{ ... }
Of course, this isn't very useful as written, because that's equivalent to just using (L,R)
in place of T
:
impl<L,R> Foo<(L,R)> { ... }
But that's an impl
block. Presumably the need to get at the component types of the tuple is if they're needed in a type signature for one of the trait functions. In that case, if we had a way of introducing type parameters in a trait
block that aren't actually considered parameters to the trait name, that would work. Either of the following two syntaxes could represent that:
pub trait<L,R> Foo<T>
where T == (L,R)
{ ... }
or
pub trait Foo<T>
where <L,R> T == (L,R)
{ ... }
On Tue, Jun 24, 2014 at 02:28:40PM -0700, Kevin Ballard wrote:
Looking at this a bit more, I'm confused. The trait coherence rules documented above say that you can't implement the same trait twice on a given type, even if the type parameters to the trait differ. But here you're defining multiple Add implementations on the single type (). Isn't that a violation of the same coherence rule that prevented Add<Complex, Complex> + Add<int, Complex> on Complex?
There is exactly one impl for ()
, so there is no coherence
violation. However, over the last day or so, I did realize that the
proposal doesn't work quite how I described it: I'd have to change the
rules regarding so-called "trivial" obligations that do not involve
type parameters. You still want the caller to verify those
obligations in order to use the
() : Trait<A,B,C>
style. This is because, not knowing the types A and B, the callee
doesn't have enough information to completely resolve the impl. This
is fine. (Simpler, really.)
On Tue, Jun 24, 2014 at 03:48:51PM -0700, Kevin Ballard wrote:
Actually even this is unnecessarily repetitious. The
Add
impl trait has no need to be parameterized on the LHS or RHS, as those are both part of the tuple type it's implemented on.
It is true that one could define Add simpler, though it'd be a
semantic change from today since it would have to take its left and
right arguments by value and not by reference (since the argument must
be a (L,R)
tuple, not (&L,&R)
). But I wanted to define a pattern
that scales well to the more general case.
I'm exceptionally wary of having two completely different ways of doing the same thing. Not only will we have to specify all the ways in which they interact with each other, but future enhancements will have to consider which methods will support which features. For instance, what happens with the following? fn foo<T: Eq>(x: T) where T: Ord { ... } I'm curious how onerous it would be to get rid of the old syntax and entirely replace it with And if we do decide to keep both, at what threshold of complexity is one supposed to stop using |
@bstrie With a Threshold of complexity would be up to the opinion of whomever is writing the code. But we already have that issue today, e.g. how complex must a generic type specialization be before you give it a |
👍 By a strange coincidence, yesterday I was thinking of something exactly like this. However, there is still a major problem that is only partially fixed by this. Type parameters are really three things: type variable declarations, type parameters, and type bounds. Right now these are (in most cases) combined into one (with type parameters separate on To remedy this I’d probably make all type declarations implicit, which solves the issue of separating type variable declarations from type parameters: use std::tuple::Tuple2;
trait Add<S>: Tuple2<L, R> {
fn add(left: &L, right: &R) -> S;
}
impl Add<int> for (int, int) {
fn add(left: &int, right: &int) -> int { ... };
}
fn reduce<T: Clone>(xs: &[T]) -> T
where (T, T): Add<T> {
let mut accum = xs[0].clone();
for x in xs.slice_from(1) {
accum = <() as Add>::add(&accum, &x);
}
} (An alternative would be to require explicit declaration of type variables everywhere, resulting in verbosity like This means that there are no hacky workarounds using (The above proposal is probably best suited for an RFC, but would only make sense after this RFC is accepted.) Another thing I’d like to note is that I think the proposed trait Add<L, R, S> {
fn add(left: &L, right: &R) -> S;
}
fn reduce<T: Clone, A: Add<T, T, T>>(xs: &[T]) -> A {
let mut accum = xs[0].clone();
for x in xs.slice_from(1) {
accum = <A as Add>::add(&accum, &x);
}
} |
}
}

Now the expression `a + b` would be sugar for `Add::add((a, b))`.
This would also represent a change from pass-by-shared-reference to pass-by-value for operators. Have you given it some thought whether there is a way to support both variants? For some types pass-by-shared-reference might be more efficient. With settling on tuples it seems to be even harder to get the best of both parameter passing styles.
I expect this is because this is just an example of an alternative implementation and is not being proposed as how Add
should actually work. If it were, it would presumably actually end up being Add::add((&a, &b))
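On the by-value versus by-reference question, a sketch of how both styles can coexist in present-day Rust, where `std::ops::Add::add` takes `self` by value and the trait can also be implemented for reference types. The `BigVec` type is hypothetical, standing in for a type that is expensive to move or clone.

```rust
use std::ops::Add;

// Hypothetical heap-backed type, where pass-by-reference avoids giving up ownership.
#[derive(Debug, Clone, PartialEq)]
struct BigVec(Vec<i64>);

// By-reference addition: `&a + &b` leaves both operands usable afterwards.
// Elements are added pairwise; extra elements of the longer operand are dropped.
impl<'a, 'b> Add<&'b BigVec> for &'a BigVec {
    type Output = BigVec;
    fn add(self, rhs: &'b BigVec) -> BigVec {
        BigVec(self.0.iter().zip(&rhs.0).map(|(a, b)| a + b).collect())
    }
}

// By-value addition: `a + b` consumes both operands and reuses the impl above.
impl Add for BigVec {
    type Output = BigVec;
    fn add(self, rhs: BigVec) -> BigVec {
        &self + &rhs
    }
}
```

With both impls in place, the caller picks the passing style per call site by writing `&a + &b` or `a + b`.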
@kballard, I don't see this mentioned in the RFC anywhere. I would like to see it explicitly specified. Furthermore, my question was whether we would allow the intermixing of these features. I'd be more inclined to say that if you use a |
@bstrie The RFC does already demonstrate that constraints inside the func foo<T: Clone>(x: T) -> T;
func foo<T>(x: T) -> T where T: Clone;
Why not? And the RFC already contains examples of code that use both forms in the same declaration. The only thing it doesn't demonstrate is putting bounds on the same type in both places, e.g. func foo<T: Clone>(x: T) -> T where T: Hash; and I see no reason to prohibit that. |
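A small sketch of the mixed form in present-day Rust, with one bound on `T` in the parameter list and another in the where clause. The function name `clone_and_hash` and its body are made up for illustration; the point is only that both bounds apply to `T` at once.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// `T: Clone` is declared inline, `T: Hash` in the where clause.
// The body relies on both, so the two sets of bounds are combined.
fn clone_and_hash<T: Clone>(x: &T) -> (T, u64)
where
    T: Hash,
{
    let mut hasher = DefaultHasher::new();
    x.hash(&mut hasher); // needs T: Hash
    (x.clone(), hasher.finish()) // needs T: Clone
}
```

Writing `fn clone_and_hash<T: Clone + Hash>(x: &T)` or moving both bounds into the where clause would, as far as I can tell, be equivalent.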
My personal aversion to TIMTOWTDI is my rationale for prohibiting it (if indeed we decide not to remove it entirely). But beyond matters of taste, it's also the safer option. We can make rules strict now and loosen them later. Vice versa is not possible. |
Would it be better to just allow overloading on multiple types, and check these functions when called? Consider the error messages and bounds a separate problem, to be solved independently; you could open up lots of functionality in the language, and defer improvements to error messages for library code. Library code could just stick to using indirection-traits, if you think errors are the most important issue. IMO it's a shame because this is orthogonal to Rust's core pillars. The problem with C++ is not 'having lots of features' - it's features that are half broken so you have to go through contortions to work around them. And the need for backward compatibility means we can't fix them. Indirection traits or having to destructure 'self' to access lhs/rhs seem to me like contortions resulting from missing overloading. It seems we can overload multiple parameters by building an argument tuple, but it looks strange. IMO the problem in C++ is headers, which worsen overloading/templates (i.e. needing the right headers in the right order before something will compile). Given that Rust doesn't have headers, it wouldn't suffer from this problem. When it does come to specifying a bound on a group of functions reliant on multiple types, the concept of a "Self" might get in the way. What if you want to group together 3 types (a matrix, a vector, a scalar), and some functions use different pairs? How this all works with standalone generic functions is very clear. Traits make it harder. |
On Tue, Jun 24, 2014 at 03:42:19PM -0700, Ben Striegel wrote:
Since as how the old bounds are syntactic sugar for
There is just one set of constraints and they are unioned together, so there is no problem here.
(There are other equivalent ways one could write the declaration as well.)
Yeah, I don't know.
Speaking personally, it'd probably be "any bound more complicated than |
On Fri, Jun 27, 2014 at 05:01:35AM -0700, dobkeratops wrote:
I don't know precisely what you mean, to be honest. I think you mean |
For an outsider it looks like a great proposal. In particular, it would solve the issue discussed in my comment on trait synonyms. From a historical/academic perspective, the first occurrence of |
I like the syntax proposed by gashe. |
As far as the actual feature being proposed is concerned, which is The whole discussion about how to encode "multidispatch traits", on the other hand, gives me the heebie jeebies. Compared to the beautifully simple MPTCs of Haskell? Seriously? Instead of using more gunk to accommodate existing conceptual gunk, I would rather go to work on making the existing features simpler, more consistent, and more accommodating. The following is off the top of my head, so the details can probably stand wibbling, but e.g.:
This preserves "there's one way to do it" on the semantic level, avoiding the whole TFs vs FDs debate Haskell has been laboring under (and also avoiding the need to figure out some inevitably-awkward syntax for FDs), and matches intuition, i.e. the things defined by the The
with no funny business required. |
I first encountered the idea of using a tuple for multiple dispatch from PyPy. Here is a link to PyPy's pairtype. |
The following example of yours is unreadable mainly because of the code formatting style you use:
After editing only whitespaces:
+4 lines, but the readability easily wins the trade-off IMO. |
The unreadable version is the recommended style, though. https://github.com/rust-lang/rust/wiki/Note-style-guide#function-declarations |
One more thing:
Unless there's an obvious theoretical justification for it to work a certain way (i.e. for logical consistency with other cases), I think the prudent thing to do here, given that the use case is not obvious (and neither, therefore, is how it should work), would be forbid it. Then when someone files a bug report or RFC saying, "I want to be able to write this because of reason", we'll have a better idea of why it should do what, and we can implement it that way. In other words, don't commit to a semantics prematurely. (In Haskell, it checks whether there is an instance in scope at the call site, as for all other constraints. This makes sense for Haskell because you can have orphan instances, but it may not for Rust.) |
@glaebhoerl Why restrict that, though? It seems like it should work just fine without the restriction, and I don't see any benefit to adding that restriction. I could see perhaps a lint that warns you that your where-clause is asserting something about a type that is not using type parameters, as that may be a mistake, but that's different than outright forbidding it. |
@kballard Because it's not clear whether it should be checked at the callee, the caller, or whatever. If it is clear and there's only one obvious way it could work, then my comment was based on invalid assumptions and should be disregarded. But if there's more than one possible choice, and we arbitrarily choose one of them, and then we later discover a reason for doing it the other way, it's a backwards compatibility break. If we start out not allowing it, then we're free to add it later in either form. (I assumed the choice was arbitrary because the RFC didn't provide any justification for it, but again, that may not be accurate.) |
@glaebhoerl I could not agree with you more about getting rid of Self in traits and what not. My guess as to why that hasn't been done long ago is object types. But IMO people will eventually be clamoring for more powerful existentials anyways, and replacing the current syntax with e.g. I am hoping the "tuple trick" was just an example of what could be done with this new syntax to fake multi-parameter traits as a stop gap, and eventually we will get the real thing, hopefully as @glaebhoerl describes it. Also +1 for disallowing the current bound syntax if this lands. In addition to the other reasons mentioned, if we get HKTs, kind signatures would take that syntax, and we'd have to refactor our code anyways. Better refactor now pre-1.0 than later. |
@Ericson2314 I was thinking kind signatures would have a prefix form-follows-declaration and vaguely C++-like syntax, e.g. |
This RFC is a great idea, but the existing trait bounds syntax should then be removed. "There should be one and only one obvious way to do it." If you add two ways to do something, then:
|
@Valloric I agree—in addition to your points, removing the existing syntax would make HKT a lot nicer. Right now the colon is inconsistent—in function arguments, it denotes a type, but in type parameters, it denotes trait bounds. The colon is already used to denote the type of a value, so inside type parameters they could also be used to denote the kind of a type: fn f<A: type<*> -> *>(x: A<int>) -> A<uint> { ... } // bikeshed
1i // sample value
fn(int) -> int // sample type (function)
type<*> -> * // sample kind (type function) |
you could say the colon is the type of a type, or a bound. (value bounded as type, type bounded as trait) .. it's consistent enough |
At first glance, I really like the ideas put forward by @glaebhoerl in this comment. |
Great proposal, at least in identifying a problem. The bounds syntax is very noisy, and in its position it obscures what the line of code really does at a first glance:
What you want to read easily is: What is the implementing type. Is it implementing a trait and if so, which? The bounds are secondary to this. This is true both in source files as well as in documentation; The docs use the same syntax to display the trait information, see for example the list of trait implementations here Would it be possible to reduce the noise even more, to something like this:
It is much easier to read. |
@bluss That doesn't work because |
…idispatch pattern, as it now seems that multidispatch is frequent enough to merit better support.
@brson -- I think this is ready to merge. |
ping @brson |
Merged as RFC 66. Tracking. Discussion |
Add where clauses, which provide a more expressive means of specifying trait parameter bounds. A where clause comes after a declaration of a generic item (e.g., an impl or struct definition) and specifies a list of bounds that must be proven once precise values are known for the type parameters in question.
Main benefits:
Option<T> where T is a type parameter.
Rendered view.
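As an illustration of a bound placed on a type such as `Option<T>` rather than on the bare parameter `T`, here is a minimal sketch in present-day Rust. The function `describe` is a hypothetical name, not taken from the RFC.

```rust
use std::fmt::Debug;

// The bound is on `Option<T>`, not on `T` itself. The inline `<T: ...>`
// syntax cannot express this, but a where clause can.
fn describe<T>(value: Option<T>) -> String
where
    Option<T>: Debug,
{
    format!("{:?}", value)
}
```

A call such as `describe(Some(3))` type-checks because `Option<i32>` implements `Debug`.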