
Only consider yields coming after the expressions when computing generator interiors #44392

Merged: 8 commits merged into master from yield-order on Sep 21, 2017

Conversation

@Zoxc (Contributor) commented Sep 7, 2017

During computation of generator interiors, when looking at the scopes for which an expression's temporaries can live, only consider yields which appear after the expression in question in the HIR.
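
As a concrete illustration, here is a minimal sketch of the pattern this affects (it assumes the 2017-era nightly #![feature(generators)] syntax and borrows the foo signature from the test touched by this PR; it is not taken verbatim from the PR):

#![feature(generators)]

fn foo(_a: (), _b: &bool) {}

fn bar() {
    let b = true;
    let _g = || {
        // Arguments are evaluated left to right, so the generator suspends at
        // the yield *before* `&b` is created. Previously the analysis saw a
        // yield anywhere inside the borrow's temporary scope and treated `&b`
        // as live across the suspension point; with this change, a yield that
        // comes before the expression in the HIR is ignored.
        foo(yield, &b);
    };
}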

@rust-highfive (Collaborator):

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

}

impl<'a, 'gcx, 'tcx> InteriorVisitor<'a, 'gcx, 'tcx> {
    fn record(&mut self, ty: Ty<'tcx>, scope: Option<region::Scope>, expr: Option<&'tcx Expr>) {
        use syntax_pos::DUMMY_SP;

        if self.fcx.tcx.sess.verbose() {
            let span = scope.map_or(DUMMY_SP, |s| s.span(self.fcx.tcx, &self.region_scope_tree));
Contributor:

Nit: indentation

}

impl<'a, 'gcx, 'tcx> InteriorVisitor<'a, 'gcx, 'tcx> {
    fn record(&mut self, ty: Ty<'tcx>, scope: Option<region::Scope>, expr: Option<&'tcx Expr>) {
        use syntax_pos::DUMMY_SP;

        if self.fcx.tcx.sess.verbose() {
            let span = scope.map_or(DUMMY_SP, |s| s.span(self.fcx.tcx, &self.region_scope_tree));
            self.fcx.tcx.sess.span_warn(span, &format!("temporary scope for node id {:?}", expr));
Contributor:

Also, this is not what verbose is meant to be used for. Can you just remove this, or else use debug!?

self.region_scope_tree.yield_in_scope(s)
self.region_scope_tree.yield_in_scope(s).and_then(|(span, expr_count)| {
    // Check if the span in the region comes after the expression
    if expr_count > self.expr_count {
Contributor:

This makes me pretty uncomfortable. It's unclear that a "count" of the number of expressions is a good way of judging control-flow. I'd feel better if we used just the statements of a block: this would make simple stuff like

fn bar(baz: String) -> impl Generator<Yield=(), Return=()> {
    move || {
        yield drop(&baz);
    }
}

work fine, but wouldn't get the 'intra-statement' ordering. This seems .. kind of ok? Or do you think it's not good enough?

@Zoxc (Contributor, Author):

I suspect we'd want to be more flexible here. await! will probably be nested inside expressions, unlike plain yield which doesn't have a useful result.
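
For instance, here is a hedged sketch of a yield nested inside a larger expression, the kind of shape an await!-style macro would produce (combine is a made-up helper, and this again assumes the nightly generators feature). Statement-level tracking would treat the whole statement uniformly, while per-expression ordering can see that &s exists before the suspension point but the call's result does not:

fn combine(_a: &String, _b: ()) -> u32 { 0 }

fn nested_yield() {
    let s = String::from("hi");
    let _g = || {
        // `&s` is evaluated before the yield, so it is live across the
        // suspension; the result of `combine` only comes into existence
        // after the generator resumes.
        let _n = combine(&s, yield);
    };
}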

impl<'a, 'gcx, 'tcx> Visitor<'tcx> for InteriorVisitor<'a, 'gcx, 'tcx> {
    fn nested_visit_map<'this>(&'this mut self) -> NestedVisitorMap<'this, 'tcx> {
        NestedVisitorMap::None
    }

    fn visit_body(&mut self, _body: &'tcx Body) {
Contributor:

Why are you recurring into closures? Can't you consider each closure separately?

@Zoxc (Contributor, Author):

I realized that closures / generators aren't visited if nested_visit_map returns None, so I just removed this code.

}

impl<'a, 'gcx, 'tcx> InteriorVisitor<'a, 'gcx, 'tcx> {
    fn record(&mut self, ty: Ty<'tcx>, scope: Option<region::Scope>, expr: Option<&'tcx Expr>) {
        use syntax_pos::DUMMY_SP;

        if self.fcx.tcx.sess.verbose() {
            let span = scope.map_or(DUMMY_SP, |s| s.span(self.fcx.tcx, &self.region_scope_tree));
            self.fcx.tcx.sess.span_warn(span, &format!("temporary scope for node id {:?}", expr));
Contributor:

indentation

self.expr_count += 1;

if self.fcx.tcx.sess.verbose() {
    self.fcx.tcx.sess.span_warn(expr.span, &format!("node id {:?}", expr.id));
Contributor:

indentation

        let scope = self.region_scope_tree.temporary_scope(expr.hir_id.local_id);

Contributor:

odd whitespace


fn bar(baz: String) -> impl Generator<Yield=(), Return=()> {
    move || {
        yield drop(&baz);
Contributor:

I'm afraid you can do

struct DropBomb;
impl Drop for DropBomb {
    fn drop(&mut self) {
        println!("splat!");
    }
}

fn bar() -> impl Generator<Yield=(), Return=()> {
    move || {
        yield &DropBomb;
    }
}

Because yield is currently not a terminating scope, this would only get dropped after the generator returns. If DropBomb was e.g. !Send, this can obviously cause trouble.

This can be solved by making yield a terminating scope, which would force temporaries in all of its arguments to be destroyed before it is reached (search for terminating_scope in region).
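
For readers unfamiliar with the term, here is an analogy in ordinary Rust (not code from this PR): the condition of an if is already a terminating scope, which is why the lock-guard temporary below is dropped as soon as the condition has been evaluated. Making yield a terminating scope would, on this proposal, force the DropBomb temporary above to be dropped before the generator suspends in the same way.

use std::sync::Mutex;

fn main() {
    let m = Mutex::new(vec![1, 2, 3]);
    // The `if` condition is a terminating scope: the temporary MutexGuard
    // produced by `m.lock().unwrap()` is dropped once the condition has been
    // evaluated, before the block body runs.
    if m.lock().unwrap().is_empty() {
        // The guard has already been released here.
        println!("empty");
    }
}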

@Zoxc (Contributor, Author):

We may want to allow generators to return references to locals, in which case having DropBomb live across the suspension point would be useful. I changed the ordering so the suspension point comes after the yield subexpression, making &DropBomb and DropBomb part of the generator interior.

@@ -735,6 +751,8 @@ fn resolve_stmt<'a, 'tcx>(visitor: &mut RegionResolutionVisitor<'a, 'tcx>, stmt:
fn resolve_expr<'a, 'tcx>(visitor: &mut RegionResolutionVisitor<'a, 'tcx>, expr: &'tcx hir::Expr) {
    debug!("resolve_expr(expr.id={:?})", expr.id);

    visitor.expr_count += 1;
@arielb1 (Contributor), Sep 7, 2017:

I find this mechanism somewhat hacky (traversing the CFG in preorder to gather this ordering). It would be nice to at least justify why it works:

  • execution can't go backwards in node postorder, except through a backedge (aka loop)
  • if node A is preorder-before node B, but node B is postorder-before node A, then A is a parent of B. This comes from the definition of a DFS.

Therefore, assuming that expr A is preorder-after a yield, we want to show that the results of expr A can't be used "over" the yield. There are 2 cases:

  1. if expr A is postorder-after the yield, then it can only reach the yield across a loop it is in. But a loop is a terminating scope and drops all temporaries created within it.
  2. if the yield is postorder-after expr A, then the yield is a parent of expr A. If we make yields a terminating scope (see the borrow-in-yield-expr example), then the terminating scope would destroy the temporary created by expr A.
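
A hedged sketch of case 1 (the helpers are made up, this assumes the nightly generators feature, and it is not one of the PR's tests): a temporary created after the yield can only reach that yield again by looping around, and it is dropped at the end of the loop body, before the back edge.

struct Bomb;
fn make_bomb() -> Bomb { Bomb }
fn use_ref(_r: &Bomb) {}

fn case_one() {
    let _g = || {
        for _ in 0..2 {
            yield;
            // Temporary lifetime extension keeps the `make_bomb()` temporary
            // alive until the end of the loop body...
            let r = &make_bomb();
            use_ref(r);
            // ...where it is dropped, before the back edge can bring control
            // back to the yield above, so it is never live across a
            // suspension point.
        }
    };
}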

Contributor:

I'm still a bit uncomfortable with the hackiness here.

@arielb1 (Contributor), Sep 7, 2017:

Oops. Assignments are evaluated RTL, so you can do

    *{ yield; x } = &y;

And &y would be stored over the yield.

Maybe use the old CFG and do dataflow over it? That would be ugly, but work.

@arielb1 (Contributor), Sep 7, 2017:

The visitor goes over the subexpressions in RTL (matching the right-to-left evaluation order), so this actually works.

Now there's this (archaic and incorrect, but correct-in-spirit) comment, which justifies the property:

//! Note: it is an important invariant that the default visitor walks
//! the body of a function in "execution order" (more concretely,
//! reverse post-order with respect to the CFG implied by the AST),
//! meaning that if AST node A may execute before AST node B, then A
//! is visited first. The borrow checker in particular relies on this
//! property.

@arielb1 (Contributor) commented Sep 7, 2017

Note that MIR regionck won't much improve the situation - we'll still want to figure out interior types before we finish type-checking.

@arielb1 (Contributor) commented Sep 7, 2017

I think the pre-order numbering annoys my aesthetic sense more than the CFG preorder traversal. If you can find a nice way of doing this without matching numbers between the 2 visitors, that would make me more comfortable.

The CFG ordering thing looks fairly easy to guarantee going forward, and because we need something like it to avoid MIR regionck dependency cycles, we probably want that.

@alexcrichton added the S-waiting-on-author label (Status: This is awaiting some action, such as code changes or more information, from the author) on Sep 7, 2017
@bors (Contributor) commented Sep 8, 2017

☔ The latest upstream changes (presumably #43742) made this pull request unmergeable. Please resolve the merge conflicts.

@nikomatsakis (Contributor):

@arielb1

Can you elaborate btw on what is wrong with the comment regarding the traversal being a "reverse-post-order on the CFG"? I think on IRC you said it was impossible to do, but I'm not a priori sure why that should be.

Regarding your explanation in this comment: what you wrote makes sense, but I'm feeling a bit confused about some of the particulars of your terminology. For example, when you talk about a "pre-order", do you mean on HIR tree or on the underlying control-flow-graph we will generate? I sort of assume that you meant the tree.

But of course the property we are interested in is most naturally expressed in terms of the CFG: at the point where the yield occurs, what temporaries may still be live?

I guess I am still trying to work out the best way to understand the property we are proving here. Clearly we are trying to relate the visit order over the HIR to the eventual execution paths in the CFG.

I think it'd be helpful to me to try and state those properties in terms of CFG nodes and relate them to the HIR. For example, a given HIR node has "parts" of its execution occurring at different times relative to its children, so I'm not 100% sure what it means for execution to go backwards or forwards relative to the postorder on the HIR tree.

So maybe we can use some terminology like this: Let us use start(H) to denote the start point of a HIR node H in the graph (this, I believe, is a unique point in the CFG). We will use ends(H) to denote the end-points where we exit the HIR node H. This is not a unique point because of unwinding and other "abrupt" exits from the HIR node (e.g., break).

In this case, I believe that some basic properties like this hold (unless I'm overlooking something):

  • start(H) dominates all the nodes in ends(H)
  • if C is some (possibly transitive) child of H, then I think that start(H) dominates start(C)
    • this follows from not having branches into the middle of other nodes, but only to the start or exit of our ancestors in the tree
  • every path from start(C) to the end of the control-flow graph must pass through some node in ends(H)

But I'm not quite sure where to go from here yet. =)

@Zoxc (Contributor, Author) commented Sep 10, 2017

I'm not too concerned about the soundness of this approach. We could add a check in borrowck which catches borrows across yields and the sanity check in the generator transformation will catch types live across suspension points. So if this is wrong, we could ICE instead of generating bad code.

I'm not sure if this is the approach I'd like long term for calculating generator interior types. I want to generate MIR during typeck, possibly run drop elaboration on that, and use a liveness analysis on that MIR which is aware of (NLL) borrows.

@arielb1 (Contributor) commented Sep 10, 2017

Can you elaborate btw on what is wrong with the comment regarding the traversal being a "reverse-post-order on the CFG"?

First, this is a visitor, which makes a tree of calls to visit. Therefore, you can use it to get both pre-order and post-order traversal, depending on which way you set up your code.

Actually, I think a post-order traversal of the tree is compatible with an RPO traversal of the CFG, so that's actually correct.

Regarding your explanation in this comment

pre-order/post-order indeed referred to pre-order/post-order in the tree.

By inspection of the CFG, CFG edges only go "forward" in the post-order tree traversal, except for loop back edges (e.g. if we see that tree post-order is actually RPO, then the only edges that go against RPO are loop back edges). Such a back-edge going over a temporary would erase it.

@alex (Member) commented Sep 11, 2017

This resolves #44200 and maybe #44257, is that right?

@Zoxc (Contributor, Author) commented Sep 11, 2017

@alex It resolves both of those and case 1 in #44197

@arielb1 (Contributor) commented Sep 11, 2017

I think it'd be helpful to me to try and state those properties in terms of CFG nodes and relate them to the HIR.

We can map CFG nodes to their respective HIR node. Let's call that map H(c) = h. That is, obviously, not an injective map. That allows us to talk of "the current HIR node execution is in".

We may care about which HIR nodes are "currently being executed". Because HIR nodes can't overlap, we can define START(h) and END(h) sets of CFG nodes (actually START might always be 1 element), where you can't go from START(h) to START(h) without going through END(h) (because you can't "overlap" execution of multiple instances of the same HIR node).

What we care about here is the set of CFG nodes during which "borrowck operations" take place. Let's call that BORROWCK(h). By inspection, we can see that all borrowck-relevant operations run at the very end of a HIR node: if control leaves a HIR node after a borrowck-relevant operation, control can't return to the "same execution" of that HIR node again. You can't go from BORROWCK(h) to some other HIR node without going through END(h).

@nikomatsakis (Contributor):

r? @arielb1

@arielb1 seems to have this well in-hand! The main thing I would like is some clear documentation of why we believe this to be correct, and some comments to help us maintain the needed invariants. =)

@arielb1 (Contributor) commented Sep 12, 2017

Now with a proof. Turns out we don't need the backedge property after all.

r? @nikomatsakis

@arielb1 force-pushed the yield-order branch 2 times, most recently from b2dd1cf to d3dd847 on September 12, 2017 at 22:57
@Arnavion:

(Just to connect to the issue that @alex opened against futures-await, this does not fix alexcrichton/futures-await#18 )

@nikomatsakis (Contributor) left a comment:

This makes sense to me. I left some questions about the proof and maybe a minor suggestion.

/// we might visit `U` without ever getting to `D`).
/// 3. However, we guarantee that at each HIR point, each
/// binding/temporary is always either always storage-live
/// or always storage-dead. This is what is being guaranteed
Contributor:

This is true modulo moves, I suppose.

Contributor:

storage-live does not consider moves.

/// at yield-points at these indexes.
///
/// Let's show that: let `D` be our binding/temporary and `U` be our
/// other HIR node, with `HIR-postorder(U) < HIR-postorder(D)`.
Contributor:

I think this comment would be improved with a simple example. Let me try to work one out live here:

Consider this Rust program. We want to check that, at the time in which yield x executes, the temporary storing 22 has not been initialized, because we don't want to store the 22 in our generator state (if it helps, imagine that 22 is rather some value with a destructor that has to execute).

foo(3, yield x, bar(&22))

This results in a tree like so:

  • HA: call
    • HB: foo
    • HC: 3
    • HD: yield
      • HE: x
    • HF: bar
      • HG: &
        • HH: 22

With the following traversal orders:

  • Preorder traversal: HA, HB, HC, HD, HE, HF, HG, HH
  • Postorder traversal: HB, HC, HE, HD, HH, HG, HF, HA

Therefore, here, we will let D be the HIR node HH, which creates a temporary whose scope is equal to HA. Then let U be the HIR node HD where the yield occurs. You can see that HD appears before HH in the post-order traversal. We therefore surmise that the temporary created by HH must not yet have been initialized.

To see why this is correct, consider:

  • (your bullet points here)

Contributor:

(I found it really helpful when reading the later logic to relate to that example)

///
/// Then:
/// 1. From the ordering guarantee of HIR visitors (see
/// `rustc::hir::intravisit`), `D` does not dominate `U`.
Contributor:

All we know is that U comes before D in the postorder, therefore we know that U may end before D ends in execution order. From this we are concluding that End(D) does not dominate End(U), right? That makes sense.

There is a kind of unwritten assumption that the value produced by D is not produced until End(D), I think, right? (Which makes sense.)

Contributor:

There is a kind of unwritten assumption that the value produced by D is not produced until End(D), I think, right? (Which makes sense.)

Yeah.

@@ -12,12 +12,10 @@

fn foo(_a: (), _b: &bool) {}

// Some examples that probably *could* be accepted, but which we reject for now.

fn bar() {
Contributor:

I think a comment would be good here to explain the purpose of this test.

For example:

// Test that a yield occurring in the list of arguments *before* a borrow
// does not cause a borrowck error about having a live borrow.

Sorta obvious from the name but still useful sometimes.
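
For contrast, a hedged sketch of the opposite case, which this test is not about (baz is a made-up helper with the argument order reversed): if the borrow is evaluated before the yield, it is live across the suspension point and must still be recorded in the generator interior.

fn baz(_a: &bool, _b: ()) {}

fn contrast() {
    let b = true;
    || {
        // `&b` already exists when the generator suspends at the yield, so it
        // must be kept in the generator interior across the suspension point.
        baz(&b, yield);
    };
}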

@Zoxc (Contributor, Author) commented Sep 14, 2017

@arielb1 I changed the code so that only yields after the expression are considered. It makes sense that the result of foo(yield) isn't live during the suspension point, but I'm not quite sure how that works with box and placement expressions.

@arielb1 (Contributor) commented Sep 14, 2017

I thought box was desugared before we get to this stage. Apparently it's not (placement new is, which is just ugly).

@arielb1 (Contributor) commented Sep 14, 2017

And box creates a MIR value (though not a "borrowck-tracked" value) before its end point. So we have to track the "empty box" somehow. Maybe. At least to avoid the ICE.

@arielb1 (Contributor) commented Sep 14, 2017

I think desugared placement new just makes this ugly. It creates a "desugared" *mut T live over its interior, which means our generator is not Sync if it yields inside the placement new.

@nikomatsakis (Contributor):

We couldn't desugar box, I think, because of type inference interactions. I sort of forget, but I think maybe @eddyb had a good idea in that respect...

@nikomatsakis (Contributor):

@bors r+

@bors (Contributor) commented Sep 20, 2017

📌 Commit 2384dd9 has been approved by nikomatsakis

@bors (Contributor) commented Sep 20, 2017

⌛ Testing commit 2384dd9 with merge 870483a...

bors added a commit that referenced this pull request Sep 20, 2017
Only consider yields coming after the expressions when computing generator interiors

During computation of generator interiors, when looking at the scopes for which an expression's temporaries can live, only consider yields which appear after the expression in question in the HIR.
@bors (Contributor) commented Sep 21, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: nikomatsakis
Pushing 870483a to master...
