-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Begin documenting initialization checking #2174
Closed
Closed
Changes from 2 commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1592,7 +1592,49 @@ our example, this would be the case if the type of `list` had been | |
reported by the next portion of the borrowck, described in the next | ||
section. | ||
|
||
### Borrow checker phase 2: reporting errors | ||
### Borrow Checker phase 2: computing initialization | ||
|
||
In addition to propagating borrows, the borrow-checker makes sure that uninitialized values can never be seen. | ||
|
||
To ensure that, we also compute whether each lvalue is potentionally and/or definitely initialized at each point in the CFG. | ||
|
||
We compute that using another standard dataflow computation. Of course, there might be an infinite amount of lvalues (e.g. consider a linked list using `Box`), but the computation can be done in finite time because lvalues form a tree, and the transfer function always affects an **lvalue subtree** at once (where an lvalue `cv` is in the subtree generated by the lvalue `lv` if `lv` is a **prefix** of `cv`). | ||
|
||
The transfer function is as follows: | ||
|
||
- at function entry, the function arguments are definitely initialized, and all other locals are definitely uninitialized | ||
- after a consuming use of the lvalue `lv`, if the type of `lv` does not implement `Copy`, all lvalues in the lvalue subtree of `lv` are definitely uninitialized | ||
- after a `drop(lv)` statement, all lvalues in the lvalue subtree of `lv` are definitely uninitialized | ||
- after an `lv = <rvalue>` statement or an `lv <- <rvalue>` (drop+replace) statement, all lvalues in the lvalue subtree of `lv` are definitely initialized | ||
- after a non-panic return from an `lv = call <fn>(<args>)` statement, all lvalues in the lvalue subtree of `lv` are definitely initialized | ||
|
||
NOTE: it is possible for all the fields of a struct or tuple to be initialized without the struct/tuple itself being initialized, for example: | ||
```Rust | ||
let x: (u32, u32); | ||
x.0 = 4; | ||
x.1 = 5; | ||
|
||
println!("{} {}", x.0, x.1); // ok | ||
|
||
println!("{:?}", x); //~ ERROR `x` is not initialized | ||
|
||
x = (x.0, x.1); | ||
println!("{:?}", x); // ok | ||
``` | ||
|
||
I'm not sure of a nice way to fix this while maintaining that structs with 0 fields can be uninitialized. XXX: PLEASE TRY TO FIND SOMETHING. | ||
|
||
However, in a valid program, if a value behind a pointer or enum is initialized, the pointer or enum discriminant itself is initialized too. This follows because the contents of a pointer/enum can only be written to while the pointer is initialized. | ||
|
||
XXX: formalize interaction of match & enums to make this correct. | ||
|
||
To ensure that the compiler can always keep track of which values are initialized, a consuming use of a non-`Copy` lvalue `lv` cause an error if `lv` has a prefix of one of these forms: | ||
1. A dereference of an `&`-reference or `*`-pointer. | ||
2. An projection of an element from an array (`[T; n]`), except through an array pattern (e.g. `let x = a[4];` is an error, but `let [.., x] = a;` is OK), as the index might not be known at compile-time. | ||
3. Any projection of an element from a slice (`[T]`). This might be relaxed to allow slice patterns if someone figures out how to implement them. | ||
4. A field projection of an ADT with a destructor (both field accesses like `foo.x` and patterns, including enum patterns like `E::A(data)`), as it is not obvious when the destructor should be run. | ||
|
||
### Borrow checker phase 3: reporting errors | ||
|
||
At this point, we have computed which loans are in scope at each | ||
point. Next, we traverse the MIR and identify actions that are illegal | ||
|
@@ -1630,8 +1672,7 @@ There are a few interesting cases to keep in mind: | |
extend the text here to cover `Box`, though some questions arise | ||
around the handling of drop (see the section on drops for details). | ||
|
||
**Accessing an lvalue LV.** When accessing an lvalue LV, there are two | ||
axes to consider: | ||
**Accessing an lvalue LV.** For checking borrows, when accessing an lvalue LV, there are two axes to consider: | ||
|
||
- The access can be SHALLOW or DEEP: | ||
- A *shallow* access means that the immediate fields reached at LV | ||
|
@@ -1661,6 +1702,55 @@ The pseudocode for deciding when an access is legal looks like this: | |
|
||
``` | ||
fn access_legal(lvalue, is_shallow, is_read) { | ||
check_initialized(lvalue, is_shallow); | ||
check_borrows(lvalue, is_shallow, is_read); | ||
check_permission(lvalue, is_read); | ||
check_immutable_binding(lvalue, is_read); | ||
} | ||
|
||
fn check_permission(lvalue, is_read) { | ||
for each prefix of lvalue { | ||
if prefix ~ `*base` && | ||
!is_read && | ||
base is an immutable reference or pointer | ||
{ | ||
// can't mutate through an immutable reference | ||
report_error(); | ||
} | ||
} | ||
} | ||
|
||
fn check_immutable_binding(lvalue, is_read) { | ||
for each shallow prefix of lvalue { | ||
if is_possibly_initialized(lvalue) && | ||
!is_read && | ||
prefix is an immutable binding | ||
{ | ||
// can't re-mutate a non-mutable binding | ||
report_error(); | ||
} | ||
} | ||
} | ||
|
||
fn check_initialized(lvalue, is_shallow) { | ||
if is_shallow { | ||
for each prefix of lvalue { | ||
if prefix ~ `*base` && !is_definitely_initialized(base) { | ||
// can't write through an uninitialized pointer | ||
report_error(); | ||
} | ||
} | ||
} else { | ||
for each child in lvalue_tree(lvalue) { | ||
if !is_definitely_initialized(lvalue) { | ||
// can't use an uninitialized value | ||
report_error(); | ||
} | ||
} | ||
} | ||
} | ||
|
||
fn check_borrows(lvalue, is_shallow, is_read) { | ||
let relevant_borrows = select_relevant_borrows(lvalue, is_shallow); | ||
|
||
for borrow in relevant_borrows { | ||
|
@@ -1674,40 +1764,49 @@ fn access_legal(lvalue, is_shallow, is_read) { | |
} | ||
``` | ||
|
||
As you can see, it works in two steps. First, we enumerate a set of | ||
in-scope borrows that are relevant to `lvalue` -- this set is affected | ||
by whether this is a "shallow" or "deep" action, as will be described | ||
shortly. Then, for each such borrow, we check if it conflicts with the | ||
action (i.e.,, if at least one of them is potentially writing), and, | ||
As you can see, the checking of borrows works in two steps. First, we | ||
enumerate a set of in-scope borrows that are relevant to `lvalue` -- this | ||
set is affected by whether this is a "shallow" or "deep" action, as will | ||
be described shortly. Then, for each such borrow, we check if it conflicts | ||
with the action (i.e.,, if at least one of them is potentially writing), and, | ||
if so, we report an error. | ||
|
||
For **shallow** accesses to the path `lvalue`, we consider borrows relevant | ||
if they meet one of the following criteria: | ||
|
||
- there is a loan for the path `lvalue`; | ||
- so: writing a path like `a.b.c` is illegal if `a.b.c` is borrowed | ||
- there is a loan for some prefix of the path `lvalue`; | ||
- so: writing a path like `a.b.c` is illegal if `a` or `a.b` is borrowed | ||
- `lvalue` is a **shallow prefix** of the loan path | ||
1. the loan path is a prefix of the path `lvalue`; | ||
- so: writing to a path like `a.b.c` is illegal if `a`, `a.b`, or `a.b.c` itself is mutably borrowed | ||
2. the loan path is, or has a **shallow prefix** `<base>.<field1>` that accesses a field of a union, and `lvalue` has a prefix of the form `<base>.<field2>` for a *different* field with the same base union. | ||
- so: if `a.b` is a union with distinct fields `x` and `y`, shallowly accessing a path `a.b.x.c` is illegal if `a.b.y` or `a.b.y.s` is borrowed. There `<base>` is `a.b`, `<field1>` is `y` and `<field2>` is `x`. | ||
- but: unless `a.b.x` is *also* a union, it is legal to access the `a.b.x.c` is *legal* if `a.b.x.d` is borrowed, because the same union field is used in both borrows. | ||
- the prefix of `lvalue` can be an arbitrary prefix - if `a.b.y.s` is borrowed, then it is illegal to shallowly access `*(*a.b.x.c).t`. | ||
3. `lvalue` is a **shallow prefix** of the loan path | ||
- shallow prefixes are found by stripping away fields, but stop at | ||
any dereference | ||
- so: writing a path like `a` is illegal if `a.b` is borrowed | ||
- so: writing to a path like `a` is illegal if `a.b` is borrowed | ||
- but: writing `a` is legal if `*a` is borrowed, whether or not `a` | ||
is a shared or mutable reference | ||
|
||
In addition, if `lvalue` has a prefix that is a dereference `*base_lv`, then the base pointer `base_lv` must be definitely initialized (inductively, checking the shallow prefix should be equivalent) | ||
- so: writing to a path like `(*a).x` is legal only if `a` is initialized, and still legal if `(*a).x` itself is not initialized. | ||
|
||
For **deep** accesses to the path `lvalue`, we consider borrows relevant | ||
if they meet one of the following criteria: | ||
|
||
- there is a loan for the path `lvalue`; | ||
- so: reading a path like `a.b.c` is illegal if `a.b.c` is mutably borrowed | ||
- there is a loan for some prefix of the path `lvalue`; | ||
- so: reading a path like `a.b.c` is illegal if `a` or `a.b` is mutably borrowed | ||
- `lvalue` is a **supporting prefix** of the loan path | ||
1. the loan path is a prefix of the path `lvalue`; | ||
- so: reading a path like `a.b.c` is illegal if `a`, `a.b` or `a.b.c` is mutably borrowed | ||
2. the loan path is, or has a **supporting prefix** `<base>.<field1>` that accesses a field of a union, and `lvalue` has a prefix of the form `<base>.<field2>` for a *different* field with the same base union. | ||
- so: if `a.b` is a union with distinct fields `x` and `y`, deeply accessing a path `a.b.x.c` is illegal if `a.b.y`, `a.b.y.s`, or (in contrast with shallow accesses, and as long as both dereferences are of a `&mut T`) `*(*a.b.y.s).t` is borrowed. There `<base>` is `a.b`, `<field1>` is `y` and `<field2>` is `x`. | ||
3. `lvalue` is a **supporting prefix** of the loan path | ||
- supporting prefixes were defined earlier | ||
- so: reading a path like `a` is illegal if `a.b` is mutably | ||
borrowed, but -- in contrast with shallow accesses -- reading `a` is also | ||
illegal if `*a` is mutably borrowed | ||
|
||
|
||
And the accessed data and all children in its lvalue tree must be definitely initialized. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this sentence fragment supposed to be connected to the "3." bullet point immediately above, or something higher up? |
||
|
||
In addition, mutable accesses whose **shallow prefix** contains a immutable binding (`let x;`, as opposed to `let mut x;`) are only valid if the local can't possibly be initialized when the accesses occur (because deep writes require the data to be *definitely* initialized, that means that deep writes whose shallow prefix contains a immutable binding can never be valid). | ||
|
||
**Dropping an lvalue LV.** Dropping an lvalue can be treated as a DEEP | ||
WRITE, like a move, but this is overly conservative. The rules here | ||
are under active development, see | ||
|
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the current borrow checker, we do not accept this program. Are there corner cases that I am overlooking here?
Presuming there are not, I think that the current borrow checker basically enforces a rule like "Assigning to an lvalue LV requires that all prefixes of LV are definitely initialized", right?
Maybe we should just adopt this rule for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is indeed a difference from the current borrow checker. I see no problem with this rule.