BoxyUwU committed May 31, 2024
1 parent 5d808a5 commit 03aa3f5
Showing 7 changed files with 31 additions and 26 deletions.
3 changes: 1 addition & 2 deletions src/ty.md
@@ -10,8 +10,7 @@ The `ty` module defines how the Rust compiler represents types internally. It al
When we talk about how rustc represents types, we usually refer to a type called `Ty`. There are
quite a few modules and types for `Ty` in the compiler ([Ty documentation][ty]).

[ty]: https://doc.rust-lang.org/nightly/nightly-rustc/ru
]stc_middle/ty/index.html
[ty]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/index.html

The specific `Ty` we are referring to is [`rustc_middle::ty::Ty`][ty_ty] (and not
[`rustc_hir::Ty`][hir_ty]). The distinction is important, so we will discuss it first before going
5 changes: 2 additions & 3 deletions src/ty_module/binders.md
@@ -1,7 +1,6 @@

# `Binder` and Higher ranked regions

Sometimes we define generic parmeters not on an item but as part of a type or a where clauses. As an example the type `for<'a> fn(&'a u32)` or the where clause `for<'a> T: Trait<'a>` both introduce a generic lifetime parameter named `'a`. Currently there is no stable syntax for `for<T>` or `for<const N: usize>` but on nightly the `non_lifetime_binders` feature can be used to write where clauses (but not types) using `for<T>`/`for<const N: usize>`.
Sometimes we define generic parameters not on an item but as part of a type or a where clause. As an example the type `for<'a> fn(&'a u32)` or the where clause `for<'a> T: Trait<'a>` both introduce a generic lifetime named `'a`. Currently there is no stable syntax for `for<T>` or `for<const N: usize>` but on nightly `feature(non_lifetime_binders)` can be used to write where clauses (but not types) using `for<T>`/`for<const N: usize>`.

The `for` is referred to as a "binder" because it brings new names into scope. In rustc we use the `Binder` type to track where these parameters are introduced and what the parameters are (i.e. how many and whether each parameter is a type/const/region). A type such as `for<'a> fn(&'a u32)` would be
represented in rustc as:
@@ -44,7 +43,7 @@ Binder(
&[BoundVariableKind::Region(...)],
)
```
This would cause all kinds of issues as the region `'^1_0` refers to a binder at a higher level than the outtermost binder i.e. it is an escaping bound var. The `'^1` region (also writeable as `'^0_1`) is also ill formed as the binder it refers to does not introduce a second parameter. Modern day rustc will ICE when constructing this binder due to both of those regions, in the past we would have simply allowed this to work and then ran into issues in other parts of the codebase.
This would cause all kinds of issues as the region `'^1_0` refers to a binder at a higher level than the outermost binder, i.e. it is an escaping bound var. The `'^1` region (also writeable as `'^0_1`) is also ill-formed as the binder it refers to does not introduce a second parameter. Modern day rustc will ICE when constructing this binder due to both of those regions. In the past we would have simply allowed this to work and then run into issues in other parts of the codebase.
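The two failure modes described here can be sketched with a toy model. This is not rustc's implementation: `BoundRegion`, `BinderError` and `check_binder` are hypothetical names invented for illustration, and the check only looks at regions that sit directly under a single outermost binder.

```rust
/// Toy stand-in for a bound region such as `'^1_0`: a De Bruijn index
/// (how many binders outward) plus the index of the variable in that binder.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BoundRegion {
    pub debruijn: usize,
    pub var: usize,
}

/// Why a bound region is ill-formed under a single outermost binder.
#[derive(Debug, PartialEq, Eq)]
pub enum BinderError {
    /// The region refers to a binder outside the outermost one.
    Escaping(BoundRegion),
    /// The binder does not introduce a parameter with that index.
    NoSuchParameter(BoundRegion),
}

/// Checks the regions found directly under one binder that introduces
/// `num_params` parameters (hypothetical helper, not a rustc API).
pub fn check_binder(num_params: usize, regions: &[BoundRegion]) -> Result<(), BinderError> {
    for &region in regions {
        if region.debruijn > 0 {
            return Err(BinderError::Escaping(region));
        }
        if region.var >= num_params {
            return Err(BinderError::NoSuchParameter(region));
        }
    }
    Ok(())
}
```

With a binder that introduces one parameter, `'^1_0` is rejected as escaping and `'^0_1` is rejected because there is no second parameter, while `'^0_0` is accepted.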

[`Binder`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Binder.html
[`BoundVar`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.BoundVar.html
4 changes: 2 additions & 2 deletions src/ty_module/early_binder.md
@@ -10,7 +10,7 @@ fn main() {
}
```

When type checking `main` we cannot just naively look at the return type of `foo` and assign the type `T` to the variable `a`, after all the function `main` does not define any generic parameters, `T` is completely meaningless in this context. More generally whenever an item introduces (binds) generic parameters, when accessing types inside the item from outside, the generic parameters must be instantiated with values from the outer item.
When type checking `main` we cannot just naively look at the return type of `foo` and assign the type `T` to the variable `c`. The function `main` does not define any generic parameters, so `T` is completely meaningless in this context. More generally, whenever an item introduces (binds) generic parameters, when accessing types inside the item from outside, the generic parameters must be instantiated with values from the outer item.

In rustc we track this via the [`EarlyBinder`] type. The return type of `foo` is represented as an `EarlyBinder<Ty>` with the only way to access `Ty` being to provide arguments for any generic parameters `Ty` might be using. This is implemented via the [`EarlyBinder::instantiate`] method, which discharges the binder, returning the inner value with all the generic parameters replaced by the provided arguments.

@@ -70,7 +70,7 @@ impl<T> Trait for Vec<T> {

When constructing a `Ty` to represent the `b` parameter's type we need to get the type of `Self` on the impl that we are inside. This can be acquired by calling the [`type_of`] query with the `impl`'s `DefId`, however, this will return an `EarlyBinder<Ty>` as the impl block binds generic parameters that may have to be discharged if we are outside of the impl.

The `EarlyBinder` type provides an [`instantiate_identity`] function for discharging the binder when you are "already inside of it". Conceptually this discharges the binder by instantiating it with placeholders in the root universe (we will talk about what this means in the next few chapters). In practice though it simply returns the inner value with no modification taking place.
The `EarlyBinder` type provides an [`instantiate_identity`] function for discharging the binder when you are "already inside of it". This is effectively a more performant version of writing `EarlyBinder::instantiate(GenericArgs::identity_for_item(..))`. Conceptually this discharges the binder by instantiating it with placeholders in the root universe (we will talk about what this means in the next few chapters). In practice though it simply returns the inner value with no modification taking place.
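To make the relationship between the two ways of discharging the binder concrete, here is a minimal toy model. It is only a sketch: `ToyTy` and `ToyEarlyBinder` are hypothetical stand-ins invented for this example, not rustc's `Ty` or `EarlyBinder`, and the toy type language has just enough variants to express `Vec<T>`.

```rust
/// Toy type language with just enough structure to show instantiation.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ToyTy {
    U32,
    /// A generic parameter, identified only by its index.
    Param(usize),
    Vec(Box<ToyTy>),
}

/// Toy analogue of `EarlyBinder`: the wrapped type is private, so callers
/// must pick one of the two methods below to get at it.
pub struct ToyEarlyBinder(ToyTy);

impl ToyEarlyBinder {
    pub fn bind(ty: ToyTy) -> Self {
        ToyEarlyBinder(ty)
    }

    /// Discharges the binder from "outside": every parameter is replaced
    /// by the argument at the same index.
    pub fn instantiate(self, args: &[ToyTy]) -> ToyTy {
        subst(&self.0, args)
    }

    /// Discharges the binder from "inside": the value is returned as-is.
    pub fn instantiate_identity(self) -> ToyTy {
        self.0
    }
}

fn subst(ty: &ToyTy, args: &[ToyTy]) -> ToyTy {
    match ty {
        ToyTy::U32 => ToyTy::U32,
        ToyTy::Param(index) => args[*index].clone(),
        ToyTy::Vec(inner) => ToyTy::Vec(Box::new(subst(inner, args))),
    }
}
```

In this model, instantiating the bound `Vec<T>` with `[u32]` gives `Vec<u32>`, while instantiating it with the identity arguments `[T]` gives back exactly what `instantiate_identity` returns, without walking the type.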

[`type_of`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/context/struct.TyCtxt.html#method.type_of
[`instantiate_identity`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.EarlyBinder.html#method.instantiate_identity
18 changes: 10 additions & 8 deletions src/ty_module/generic_arguments.md
@@ -1,5 +1,7 @@
# ADTs and Generic Arguments

The term `ADT` stands for "Algebraic data type"; in Rust this refers to a struct, enum, or union.

## ADTs Representation

Let's consider the example of a type like `MyStruct<u32>`, where `MyStruct` is defined like so:
@@ -25,14 +27,14 @@ There are two parts:
parameters. In our example, this is the `MyStruct` part *without* the argument `u32`.
(Note that in the HIR, structs, enums and unions are represented differently, but in `ty::Ty`,
they are all represented using `TyKind::Adt`.)
- The [`GenericArgs`][GenericArgs] is a list of values that are to be substituted
- The [`GenericArgs`] is a list of values that are to be substituted
for the generic parameters. In our example of `MyStruct<u32>`, we would end up with a list like
`[u32]`. We’ll dig more into generics and substitutions in a little bit.
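The two-part representation can be sketched with a toy model. Everything here is hypothetical and invented for illustration (`ToyAdtDef`, `ToyAdt`, `FieldTy`, and the field list given to `MyStruct`); rustc's real `AdtDef` and `GenericArgs` are interned and considerably richer.

```rust
use std::rc::Rc;

/// How a field's type is written in the definition: either a reference
/// to a generic parameter by index, or some fixed named type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FieldTy {
    Param(usize),
    Named(&'static str),
}

/// Toy analogue of `AdtDef`: one shared definition per struct/enum/union.
#[derive(Debug)]
pub struct ToyAdtDef {
    pub name: &'static str,
    pub fields: Vec<FieldTy>,
}

/// Toy analogue of `TyKind::Adt(def, args)`: the definition plus the
/// list of generic arguments for this particular use of the type.
#[derive(Debug, Clone)]
pub struct ToyAdt {
    pub def: Rc<ToyAdtDef>,
    pub args: Vec<&'static str>,
}

impl ToyAdt {
    /// Same definition with a different argument list. No field is touched.
    pub fn with_args(&self, args: Vec<&'static str>) -> ToyAdt {
        ToyAdt { def: Rc::clone(&self.def), args }
    }

    /// The type of a field, computed on demand from the definition and args.
    pub fn field_ty(&self, index: usize) -> &'static str {
        match &self.def.fields[index] {
            FieldTy::Param(param) => self.args[*param],
            FieldTy::Named(name) => *name,
        }
    }
}
```

Because the argument list is the only per-use part, going from `MyStruct<A>` to `MyStruct<B>` in this model means building a new `args` list while the definition is shared.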

[adtdef]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.AdtDef.html
[GenericArgs]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/type.GenericArgs.html
[`GenericArgs`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/type.GenericArgs.html

**`AdtDef` and `DefId`**
### **`AdtDef` and `DefId`**

For every type defined in the source code, there is a unique `DefId` (see [this
chapter](hir.md#identifiers-in-the-hir)). This includes ADTs and generics. In the `MyStruct<T>`
@@ -65,8 +67,8 @@ struct MyStruct<T> {
// Want to do: MyStruct<A> ==> MyStruct<B>
```

in an example like this, we can subst from `MyStruct<A>` to `MyStruct<B>` (and so on) very cheaply,
by just replacing the one reference to `A` with `B`. But if we eagerly substituted all the fields,
in an example like this, we can instantiate `MyStruct<A>` as `MyStruct<B>` (and so on) very cheaply,
by just replacing the one reference to `A` with `B`. But if we eagerly instantiated all the fields,
that could be a lot more work because we might have to go through all of the fields in the `AdtDef`
and update all of their types.

@@ -81,7 +83,7 @@ definition of that name, and not carried along “within” the type itself).

Given a generic type `MyType<A, B, …>`, we have to store the list of generic arguments for `MyType`.

In rustc this is done using [GenericArgs]. `GenericArgs` is a thin pointer to a slice of [`GenericArg`] representing a list of generic arguments for a generic item. For example, given a `struct HashMap<K, V>` with two type parameters, `K` and `V`, the `GenericArgs` used to represent the type `HashMap<i32, u32>` would be represented by `&'tcx [tcx.types.i32, tcx.types.u32]`.
In rustc this is done using [`GenericArgs`]. `GenericArgs` is a thin pointer to a slice of [`GenericArg`] representing a list of generic arguments for a generic item. For example, given a `struct HashMap<K, V>` with two type parameters, `K` and `V`, the `GenericArgs` used to represent the type `HashMap<i32, u32>` would be represented by `&'tcx [tcx.types.i32, tcx.types.u32]`.

`GenericArg` is conceptually an `enum` with three variants, one for type arguments, one for const arguments and one for lifetime arguments.
In practice that is actually represented by [`GenericArgKind`] and [`GenericArg`] is a more space efficient version that has a method to
@@ -110,7 +112,7 @@ fn deal_with_generic_arg<'tcx>(generic_arg: GenericArg<'tcx>) -> GenericArg<'tcx
[list]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.List.html
[`GenericArg`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.GenericArg.html
[`GenericArgKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/enum.GenericArgKind.html
[GenericArgs]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/type.GenericArgs.html
[`GenericArgs`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/type.GenericArgs.html

So pulling it all together:

@@ -123,4 +125,4 @@ For the `MyStruct<U>` written in the `Foo` type alias, we would represent it in

- There would be an `AdtDef` (and corresponding `DefId`) for `MyStruct`.
- There would be a `GenericArgs` containing the list `[GenericArgKind::Type(Ty(u32))]`
- This is one `TyKind::Adt` containing the `AdtDef` of `MyStruct` with the `GenericArgs` above.
- And finally a `TyKind::Adt` with the `AdtDef` and `GenericArgs` listed above.
17 changes: 10 additions & 7 deletions src/ty_module/instantiating_binders.md
@@ -16,16 +16,16 @@ fn main() {
```
In this example we are providing an argument of type `for<'a> fn(&'^0 u32) -> &'^0 u32` to `bar`. We do not want to allow `T` to be inferred to the type `&'^0 u32` as it would be rather nonsensical (and likely unsound if we did not happen to ICE: `main` has no idea what `'a` is, so how would the borrow checker handle a borrow with lifetime `'a`?).

Unlike `EarlyBinder` we do not instantiate `Binder` with some concrete set of arguments from the user, i.e. `['b, 'static]` as arguments to a `for<'a1, 'a2> fn(&'a1 u32, &'a2 u32)`. Instead we always instantiate the binder with inference variables of placeholders.
Unlike `EarlyBinder` we typically do not instantiate `Binder` with some concrete set of arguments from the user, e.g. `['b, 'static]` as arguments to a `for<'a1, 'a2> fn(&'a1 u32, &'a2 u32)`. Instead we usually instantiate the binder with inference variables or placeholders.

## Instantiating with inference variables

We instantiate binders with inference variables when we are trying to infer a possible instantiation of the binder, i.e. calling higher ranked function pointers or attempting to use a higher ranked where clause to prove some bound (non exhaustive list). For example, given the `higher_ranked_fn_ptr` from the example above, if we were to call it with `&10_u32` we would:
We instantiate binders with inference variables when we are trying to infer a possible instantiation of the binder, e.g. calling higher ranked function pointers or attempting to use a higher ranked where-clause to prove some bound. For example, given the `higher_ranked_fn_ptr` from the example above, if we were to call it with `&10_u32` we would:
- Instantiate the binder with infer vars yielding a signature of `fn(&'?0 u32) -> &'?0 u32`
- Equate the type of the provided argument `&10_u32` (`&'static u32`) with the type in the signature, `&'?0 u32`, inferring `'?0 = 'static`
- The provided arguments were correct as we were successfully able to unify the types of the provided arguments with the types of the arguments in the fn ptr signature

As another example of instantiating with infer vars, given some `where for<'a> T: Trait<'a>`, if we were attempting to prove that `T: Trait<'static>` holds we would:
As another example of instantiating with infer vars, given some `for<'a> T: Trait<'a>` where-clause, if we were attempting to prove that `T: Trait<'static>` holds we would:
- Instantiate the binder with infer vars yielding a where clause of `T: Trait<'?0>`
- Equate the goal of `T: Trait<'static>` with the instantiated where clause, inferring `'?0 = 'static`
- The goal holds because we were successfully able to unify `T: Trait<'static>` with `T: Trait<'?0>`
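The steps in both lists can be sketched with a toy model that only tracks regions. This is a deliberate simplification under stated assumptions: `ToyRegion`, `instantiate_with_infer` and `equate` are hypothetical names invented here, and solving inference variables eagerly through a map is a shortcut for illustration rather than a description of how rustc handles regions.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToyRegion {
    Static,
    /// `'^N`: the N-th variable of the binder being instantiated.
    Bound(usize),
    /// `'?N`: an inference variable.
    Infer(usize),
}

/// Replaces every bound region with a fresh inference variable. The same
/// bound variable always maps to the same inference variable.
pub fn instantiate_with_infer(
    num_bound: usize,
    regions: &[ToyRegion],
    next_var: &mut usize,
) -> Vec<ToyRegion> {
    let base = *next_var;
    *next_var += num_bound;
    regions
        .iter()
        .map(|region| match *region {
            ToyRegion::Bound(index) => ToyRegion::Infer(base + index),
            other => other,
        })
        .collect()
}

/// Follows already-recorded solutions for inference variables.
fn resolve(region: ToyRegion, solutions: &HashMap<usize, ToyRegion>) -> ToyRegion {
    match region {
        ToyRegion::Infer(var) => match solutions.get(&var) {
            Some(&solved) => resolve(solved, solutions),
            None => region,
        },
        other => other,
    }
}

/// Equates two regions, recording a solution for an unsolved inference
/// variable. Returns `false` if the regions cannot be made equal.
pub fn equate(a: ToyRegion, b: ToyRegion, solutions: &mut HashMap<usize, ToyRegion>) -> bool {
    let a = resolve(a, solutions);
    let b = resolve(b, solutions);
    match (a, b) {
        _ if a == b => true,
        (ToyRegion::Infer(var), other) | (other, ToyRegion::Infer(var)) => {
            solutions.insert(var, other);
            true
        }
        _ => false,
    }
}
```

For the fn pointer example, the bound signature regions `['^0, '^0]` become `['?0, '?0]`, and equating `'static` with the argument region records `'?0 = 'static`.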
@@ -34,11 +34,11 @@ Instantiating binders with inference variables can be accomplished by using the

## Instantiating with placeholders

Placeholders are very similar to `Ty/ConstKind::Param`/`ReEarlyParam`, they represent some unknown type that is only equal to itself. `Ty`/`Const` and `Region` all have a `Placeholder` variant that is comprised of a `Universe` and a `BoundVar`.
Placeholders are very similar to `Ty/ConstKind::Param`/`ReEarlyParam`: they represent some unknown type that is only equal to itself. `Ty`/`Const` and `Region` all have a [`Placeholder`] variant that is comprised of a [`Universe`] and a [`BoundVar`].

The `Universe` tracks which binder the placeholder originated from, and the `BoundVar` tracks which parameter on said binder that this placeholder corresponds to. Equality of placeholders is determined solely by whether the universes are equal and the `BoundVar`s are equal. See the [chapter on Placeholders and Universes][ch_placeholders_universes] for more information.

When talking with other rustc devs or seeing `Debug` formatted `Ty`/`Const`/`Region`s, `Placeholder` will often be written as `'!UNIVERSE_IDX`. For example given some type `for<'a> fn(&'a u32, for<'b> fn(&'b &'a u32))`, after instantiating both binders (assuming the `Universe` in the current `InferCtxt` was `U0` beforehand), the type of `&'b &'a u32` would be represented as `&'!2_0 &!1_0 u32`.
When talking with other rustc devs or seeing `Debug` formatted `Ty`/`Const`/`Region`s, `Placeholder` will often be written as `'!UNIVERSE_BOUNDVARS`. For example given some type `for<'a> fn(&'a u32, for<'b> fn(&'b &'a u32))`, after instantiating both binders (assuming the `Universe` in the current `InferCtxt` was `U0` beforehand), the type of `&'b &'a u32` would be represented as `&'!2_0 &'!1_0 u32`.

When the universe of the placeholder is `0`, it will be entirely omitted from the debug output, i.e. `!0_2` would be printed as `!2`. This rarely happens in practice though as we increase the universe in the `InferCtxt` when instantiating a binder with placeholders so usually the lowest universe placeholders encounterable are ones in `U1`.
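The notation and the equality rule can be sketched as follows. `ToyPlaceholderRegion` is a hypothetical type invented for this example, not rustc's placeholder type, and its `Display` impl only mimics the convention described above.

```rust
use std::fmt;

/// Toy placeholder region: equality is derived, so two placeholders are
/// equal exactly when both the universe and the bound var are equal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ToyPlaceholderRegion {
    pub universe: u32,
    pub bound_var: u32,
}

impl fmt::Display for ToyPlaceholderRegion {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if self.universe == 0 {
            // The root universe is omitted from the output.
            write!(f, "'!{}", self.bound_var)
        } else {
            write!(f, "'!{}_{}", self.universe, self.bound_var)
        }
    }
}
```

A placeholder in universe `2` for bound var `0` formats as `'!2_0`, and one in universe `0` for bound var `2` formats as `'!2`.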

@@ -105,7 +105,7 @@ the `RePlaceholder` for the `'b` parameter is in a higher universe to track the

## Instantiating with `ReLateParam`

As discussed in a previous chapter, `RegionKind` has two variants for representing generic parameters, `ReLateParam` and `ReEarlyParam`. `ReLateParam` is conceptually a `Placeholder` that is always in the root universe (`U0`). It is used when instantiating late bound parameters on functions/closures. It's actual representation is relatively different from both `ReEarlyParam` and `RePlaceholder`:
As discussed in a previous chapter, `RegionKind` has two variants for representing generic parameters, `ReLateParam` and `ReEarlyParam`. `ReLateParam` is conceptually a `Placeholder` that is always in the root universe (`U0`). It is used when instantiating late bound parameters of functions/closures while inside of them. Its actual representation is relatively different from both `ReEarlyParam` and `RePlaceholder`:
- A `DefId` for the item that introduced the late bound generic parameter
- A [`BoundRegionKind`] which either specifies the `DefId` of the generic parameter and its name (via a `Symbol`), or that this placeholder is representing the anonymous lifetime of a `Fn`/`FnMut` closure's self borrow. There is also a variant for `BrAnon` but this is not used for `ReLateParam`.

@@ -139,4 +139,7 @@ As a concrete example, accessing the signature of a function we are type checkin
[`instantiate_binder_with_fresh_vars`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_trait_selection/infer/struct.InferCtxt.html#method.instantiate_binder_with_fresh_vars
[`InferCtxt`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_trait_selection/infer/struct.InferCtxt.html
[`EarlyBinder`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.EarlyBinder.html
[`Binder`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/type.Binder.html
[`Binder`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/type.Binder.html
[`Placeholder`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Placeholder.html
[`Universe`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.UniverseIndex.html
[`BoundVar`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.BoundVar.html
5 changes: 2 additions & 3 deletions src/ty_module/param_ty_const_regions.md
@@ -1,4 +1,3 @@

# Parameter `Ty`/`Const`/`Region`s

When inside of generic items, types can be written that use in scope generic parameters, for example `fn foo<'a, T>(_: &'a Vec<T>)`. In this specific case
@@ -31,7 +30,7 @@ struct Foo<T>(Vec<T>);
The `Vec<T>` type is represented as `TyKind::Adt(Vec, &[GenericArgKind::Type(Param("T", 0))])`.

The name is somewhat self explanatory, it's the name of the type parameter. The index of the type parameter is an integer indicating
its order in the list of generic parameters in scope (note: this includes parameters defined on items on outter scopes than the item the parameter is defined on). Consider the following examples:
its order in the list of generic parameters in scope (note: this includes parameters defined on items in outer scopes, not only on the item the parameter is defined on). Consider the following examples:

```rust,ignore
struct Foo<A, B> {
@@ -50,7 +49,7 @@ impl<X, Y> Foo<X, Y> {
}
```

Concretely given the `ty::Generics` for the item the parameter is defined on, if the index is `10` then starting from the root `parent`, it will be the eleventh parameter to be introduced.
Concretely, given the `ty::Generics` for the item the parameter is defined on, if the index is `2` then starting from the root `parent`, it will be the third parameter to be introduced. In the example above, `Z` has index `2` and is the third generic parameter to be introduced, starting from the `impl` block.
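The index computation can be sketched with a toy model. `ToyGenerics` is a hypothetical type invented for this example; its field and method names are only loosely modelled on the idea of a parent chain and are not rustc's `ty::Generics` API.

```rust
/// Toy generics: the parameters an item introduces itself, plus a link to
/// the generics of the enclosing item (if any).
pub struct ToyGenerics<'a> {
    pub parent: Option<&'a ToyGenerics<'a>>,
    pub own_params: Vec<&'static str>,
}

impl ToyGenerics<'_> {
    /// Number of parameters introduced by all enclosing items.
    pub fn parent_count(&self) -> usize {
        self.parent
            .map_or(0, |parent| parent.parent_count() + parent.own_params.len())
    }

    /// Index of a parameter, counting from the root `parent`.
    pub fn index_of(&self, name: &str) -> Option<usize> {
        if let Some(position) = self.own_params.iter().position(|param| *param == name) {
            return Some(self.parent_count() + position);
        }
        self.parent.and_then(|parent| parent.index_of(name))
    }
}
```

Modelling an `impl<X, Y>` block as the parent of an item that introduces `Z`, the lookup yields `0` for `X`, `1` for `Y` and `2` for `Z`.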

The index fully defines the `Ty` and is the only part of `TyKind::Param` that matters for reasoning about the code we are compiling.
