-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle subtyping in inference through obligations #40570
Handle subtyping in inference through obligations #40570
Conversation
r? @arielb1 (rust_highfive has picked a reviewer for you, use r? to override) |
However, with LUB, I think |
Yes, as I wrote.
Yes, as I wrote. I could not, however, find a test case to exercise that, which is partly why I didn't bother to make it work properly. Doesn't seem challenging, though, I think we just want to introduce a fresh, unconstrained variable. |
@arielb1 Sorry if I sounded testy in that last reply; my notes in the comment were a bit vague, and I agree with your clarifications. That said, I wanted to expand on one point:
I think you are correct about the eventual result but I wouldn't expect it to be derived immediately from the LUB computation. Since the last parameter is considered "bivariant" (i.e., derived from constraints on the others), I'd expect that the last parameter simply gets a fresh variable, and that trait selection will constrain it (ultimately) to be However, it occurs to me that I'm not entirely sure where these WF requirements would get enforced. Much of the code assumes that the resulting types from LUB operations etc are well-formed, and hence I'm not sure when we would add such an obligation. (One possibility is to have the LUB/GLB operation itself produce an obligation, at least if bivariance is involved, given that this is now a possibility.) Can you think of a test case that would exercise these paths? I wasn't able to. Given that I can't produce a bug, I'm tempted to leave this to a FIXME -- but I definitely think an issue is warranted, if nothing else. |
Ah, so I realized that maybe the actual example you gave would work for making such a test case. =) I'll try that out probably on Monday. I was more experimenting with types like |
My continued attempts to craft such a test case have failed. Here is my latest iteration (it is accepted by both this branch and master): #![allow(dead_code)]
use std::cell::Cell;
use std::marker::PhantomData;
struct Foo<F, A, R>
where F: Fn(A) -> R
{
f: F,
d: PhantomData<fn(A)>,
}
macro_rules! foo_ty {
($a:ty, $b:ty) => {
Foo<fn($a) -> $b, $a, $b>
}
}
trait Extract {
type Out;
fn create(self) -> Self::Out;
}
impl<F, A, R> Extract for Foo<F, A, R>
where F: Fn(A) -> R
{
type Out = R;
fn create(self) -> Self::Out { panic!() }
}
fn foo<'a, 'b, 'small, 'big>(a: foo_ty!(&'a (), &'a ()),
b: foo_ty!(&'b (), &'b ()))
-> Cell<&'small ()>
where 'a: 'small, 'b: 'small, 'big: 'a, 'big: 'b,
{
let x = match 22 { 0 => a, 1 => a, 2 => a, _ => b };
let y = Cell::new(Extract::create(x));
y
}
fn main() {
} You can see my various attempts to squash coercion and introduce invariance. I have to comb through the debug logs some more, I was doing this the other day and hence the results are out of cache for me. (I had a bunch of other iterations of course.) |
src/librustc/infer/combine.rs
Outdated
@@ -224,16 +230,15 @@ impl<'infcx, 'gcx, 'tcx> CombineFields<'infcx, 'gcx, 'tcx> { | |||
// Generalize type if necessary. | |||
let generalized_ty = match dir { | |||
EqTo => self.generalize(a_ty, b_vid, false), | |||
BiTo | SupertypeOf | SubtypeOf => self.generalize(a_ty, b_vid, true), | |||
SupertypeOf | SubtypeOf => self.generalize(a_ty, b_vid, true), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a FIXME(#18653)
to generalize?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is not needed anymore.
The PR looks nice, except I'm also not that sure about the bivariant bit. Not that I think this should block the PR. Also, there are a few |
☔ The latest upstream changes (presumably #40224) made this pull request unmergeable. Please resolve the merge conflicts. |
8174541
to
fc8fc87
Compare
src/libcore/num/flt2dec/mod.rs
Outdated
@@ -384,8 +384,8 @@ pub enum Sign { | |||
/// It can be either `b""`, `b"+"` or `b"-"`. | |||
fn determine_sign(sign: Sign, decoded: &FullDecoded, negative: bool) -> &'static [u8] { | |||
match (*decoded, sign) { | |||
(FullDecoded::Nan, _) => b"", | |||
(FullDecoded::Zero, Sign::Minus) => b"", | |||
(FullDecoded::Nan, _) => b"XXX", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
07a11ae
to
ac90cec
Compare
@arielb1 so this now fixes #18653 (and #40951). The main commit is "generalize type variables too"; the commit message there detail some of the non-obvious interactions. I did this work atop this branch because it made it easier to handle the |
ac90cec
to
874c7bc
Compare
This just limits ourselves to the "old school" defaults: diverging variables and integer variables.
There is one fishy part of these changes: when computing the LUB/GLB of a "bivariant" type parameter, I currently return the `a` value. Bivariant type parameters are only allowed in a very particular situation, where the type parameter is only used as an associated type output, like this: ```rust pub struct Foo<A, B> where A: Fn() -> B { data: A } ``` In principle, if one had `T=Foo<A, &'a u32>` and `U=Foo<A, &'b u32>` and (e.g.) `A: for<'a> Fn() -> &'a u32`, then I think that computing the LUB of `T` and `U` might do the wrong thing. Probably the right behavior is just to create a fresh type variable. However, that particular example would not compile (because the where-clause is illegal; `'a` does not appear in any input type). I was not able to make an example that *would* compile and demonstrate this shortcoming, and handling the LUB/GLB was mildly inconvenient, so I left it as is. I am considering whether to revisit this.
874c7bc
to
392c1e3
Compare
The most interesting place is the hinting mechanism; once we start having subtyping obligations, it's important to see those through.
In some cases, we give multiple primary spans, in which case we would report one `//~` annotation per primary span. That was very confusing because these things are reported to the user as a single error. UI tests would be better here.
In some specific cases, the new scheme was failing to learn as much from a LUB/GLB operaiton as the old code, which caused coercion to go awry. A slight ordering hack fixes this.
When we are generalizing a super/sub-type, we have to replace type variables with a fresh variable (and not just region variables). So if we know that `Box<?T> <: ?U`, for example, we instantiate `?U` with `Box<?V>` and then relate `Box<?T>` to `Box<?V>` (and hence require that `?T <: ?V`). This change has some complex interactions, however: First, the occurs check must be updated to detect constraints like `?T <: ?U` and `?U <: Box<?T>`. If we're not careful, we'll create a never-ending sequence of new variables. To address this, we add a second unification set into `type_variables` that tracks type variables related through **either** equality **or** subtyping, and use that during the occurs-check. Second, the "fudge regions if ok" code was expecting no new type variables to be created. It must be updated to create new type variables outside of the probe. This is relatively straight-forward under the new scheme, since type variables are now independent from one another, and any relations are moderated by pending subtype obliations and so forth. This part would be tricky to backport though. cc rust-lang#18653 cc rust-lang#40951
For the most part, it seems to be better, but one side-effect is that I cannot seem to reproduce E0102 anymore.
These are not user expressible anyhow.
392c1e3
to
7832db8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is basically ok mod. the stack nit. How important do you think is the instability thing?
src/librustc/infer/combine.rs
Outdated
@@ -182,6 +181,8 @@ impl<'infcx, 'gcx, 'tcx> CombineFields<'infcx, 'gcx, 'tcx> { | |||
a_is_expected: bool) | |||
-> RelateResult<'tcx, ()> | |||
{ | |||
use self::RelationDir::*; | |||
|
|||
// We use SmallVector here instead of Vec because this code is hot and | |||
// it's rare that the stack length exceeds 1. | |||
let mut stack = SmallVector::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: do we even need the stack and the loop here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, yeah, I think I meant to remove that! Thanks.
fn fold_ty(&mut self, ty: Ty<'tcx>) -> Ty<'tcx> { | ||
match ty.sty { | ||
ty::TyInfer(ty::InferTy::TyVar(vid)) => { | ||
match self.type_variables.get(&vid) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function is already a total hack, but I'm afraid this makes behavior depend on the mood of the union-find algorithm: if a set of (possibly-preexisting) type variables is equated and returned, we'll return one of them "at random".
For example, in
use std::mem;
fn main() {
let v = Vec::new();
fn example<'a, U>(x: Option<&'a mut &[u8]>, u: U) -> &'a mut U { loop {} }
let mut x: Option<&mut $0> = None;
let mut y = None;
mem::swap(&mut x, &mut y); // equate types, so `y: Option<&mut $0>`.
*x.unwrap() = *example::<$1>(y, &v);
}
When this is checked, EIfEO unifies $0
and $1
within the snapshot (a unification that is then discarded), and checks u
with either of the type variables as the expected var, depending on what union-find picks as the root. $0
is unified later on with &[u8]
, so if $0
is picked then the coercion from &Vec<u8>
to &[u8]
occurs before the call to example
, and if $1
is picked the coercion occurs after the call.
This might lead to instability when union-find picks different roots. Not sure how much important is this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yes, I don't relish the thought of coercions happening or not happening based on very small changes. I was hoping that we could do away with this function and just sometimes unify the return type before the arguments, but that's not precisely equivalent to what it does today. (In particular, the would force the return type to be a subtype of the output type, whereas this generates an expected type based on the assumption that it ought to be so, but doesn't add a hard constraint.)
An obvious option is not to consider the "roots" of variables -- basically any appearance of some variable ?T
that was introduced during the snapshot would be mapped to a fresh variable outside the snapshot. I'm not sure just why I didn't do this; it seems like it ought to be harmless.
At the time when I wrote this code, I was nervous about "losing" the connection, but having thought more about it I think it should be safe for us to play more-or-less arbitrary games with the expected type without (at least) compromising soundness. (This function mildly gives me the willies: e.g., it supplies an "expected-has-type" hint to the arguments, but that's stronger than is needed, etc.)
I think in my ideal world the "expected type" would be allowed to have holes (rather than fresh type variables per se), that have no effect during unification, and we would then translate all unbound type variables to these holes. But doing that would be a big refactor and of dubious (or maybe even negative) value anyway.
Anyway, so long story short, what I could do is to try just replacing all newly created, unbound type variables with newly created, unbound type variables (ignoring the union-find effects). This would probably be backportable too, I imagine. We would lose the "subtype connection" in some cases for these variables, but that shouldn't matter, since this just affects the expected type, and the subtypes should be re-established when the ultimate type is produced.
@@ -755,6 +755,9 @@ pub enum Predicate<'tcx> { | |||
/// for some substitutions `...` and T being a closure type. | |||
/// Satisfied (or refuted) once we know the closure's kind. | |||
ClosureKind(DefId, ClosureKind), | |||
|
|||
/// `T1 <: T2` | |||
Subtype(PolySubtypePredicate<'tcx>), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can subtype predicates ever be higher-ranked (aka for<'a> &'a u32 <: &'b u32
) ? I don't think so after this patch. Are you trying to support that in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, perhaps not. I was thinking that in the future a case like for<'a> fn(?T) <: for<'a> fn(?U)
might cause such a predicate, but that's not really true (the ?T
and ?U
are bound outside the for
). It would be possible to have higher-ranked subregion constraints, in contrast, but those are a separate mechanism.
@arielb1 removed the stack, simplified |
src/librustc/infer/type_variable.rs
Outdated
pub root_vid: ty::TyVid, | ||
pub root_origin: TypeVariableOrigin, | ||
} | ||
pub type TypeVariableMap = FxHashMap<ty::TyVid, TypeVariableOrigin>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually OK.FxHashMap
indexed by integers?
@bors r+ |
📌 Commit 1cc7621 has been approved by |
…ion, r=arielb1 Handle subtyping in inference through obligations We currently store subtyping relations in the `TypeVariables` structure as a kind of special case. This branch uses normal obligations to propagate subtyping, thus converting our inference variables into normal fallback. It also does a few other things: - Removes the (unstable, outdated) support for custom type inference fallback. - It's not clear how we want this to work, but we know that we don't want it to work the way it currently does. - The existing support was also just getting in my way. - Fixes #30225, which was caused by the trait caching code pretending type variables were normal unification variables, when indeed they were not (but now are). There is one fishy part of these changes: when computing the LUB/GLB of a "bivariant" type parameter, I currently return the `a` value. Bivariant type parameters are only allowed in a very particular situation, where the type parameter is only used as an associated type output, like this: ```rust pub struct Foo<A, B> where A: Fn() -> B { data: A } ``` In principle, if one had `T=Foo<A, &'a u32>` and `U=Foo<A, &'b u32>` and (e.g.) `A: for<'a> Fn() -> &'a u32`, then I think that computing the LUB of `T` and `U` might do the wrong thing. Probably the right behavior is just to create a fresh type variable. However, that particular example would not compile (because the where-clause is illegal; `'a` does not appear in any input type). I was not able to make an example that *would* compile and demonstrate this shortcoming, and handling the LUB/GLB was mildly inconvenient, so I left it as is. I am considering whether to revisit this or what. I have started a crater run to test the impact of these changes.
☀️ Test successful - status-appveyor, status-travis |
This is moderately likely to cause regressions. Relnoting. |
We currently store subtyping relations in the
TypeVariables
structure as a kind of special case. This branch uses normal obligations to propagate subtyping, thus converting our inference variables into normal fallback. It also does a few other things:There is one fishy part of these changes: when computing the LUB/GLB of a "bivariant" type parameter, I currently return the
a
value. Bivariant type parameters are only allowed in a very particular situation, where the type parameter is only used as an associated type output, like this:In principle, if one had
T=Foo<A, &'a u32>
andU=Foo<A, &'b u32>
and (e.g.)A: for<'a> Fn() -> &'a u32
, then I think that computing the LUB ofT
andU
might do the wrong thing. Probably the right behavior is just to create a fresh type variable. However, that particular example would not compile (because the where-clause is illegal;'a
does not appear in any input type). I was not able to make an example that would compile and demonstrate this shortcoming, and handling the LUB/GLB was mildly inconvenient, so I left it as is. I am considering whether to revisit this or what.I have started a crater run to test the impact of these changes.