Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reevaluate use of Store/references for recursive variables #1948

Closed
byorgey opened this issue Jun 16, 2024 · 4 comments · Fixed by #1949
Closed

Reevaluate use of Store/references for recursive variables #1948

byorgey opened this issue Jun 16, 2024 · 4 comments · Fixed by #1949
Labels
C-Moderate Effort Should take a moderate amount of time to address. L-Elaboration Elaboration takes typechecked code and manipulates it (inserts extra code, etc.) to be ready to run. S-Nice to have The bug fix or feature would be nice but doesn't currently have much negative impact. Z-Refactoring This issue is about restructuring the code without changing the behaviour to improve code quality. Z-Research This issue requires inventing something new and/or studying research papers for clues.

Comments

@byorgey
Copy link
Member

byorgey commented Jun 16, 2024

I forget why I did it this way, and I am not sure that it is really necessary. It certainly introduces extra complication (e.g. the need to wrap free variables in force during elaboration). We should either get rid of it, or add some clear comments explaining why it is necessary.

@byorgey byorgey added Z-Refactoring This issue is about restructuring the code without changing the behaviour to improve code quality. C-Moderate Effort Should take a moderate amount of time to address. S-Nice to have The bug fix or feature would be nice but doesn't currently have much negative impact. L-Elaboration Elaboration takes typechecked code and manipulates it (inserts extra code, etc.) to be ready to run. Z-Research This issue requires inventing something new and/or studying research papers for clues. labels Jun 16, 2024
@byorgey
Copy link
Member Author

byorgey commented Jun 16, 2024

OK, it is necessary, and I remember/understand why now after trying the experiment of removing them. I will write up an explanation here, and leave this open until I have added more comments in the source code (I will wait until after merging #1928).

Evaluating a non-recursive let, e.g. let x = 1 + 2 in x * x, is easy: we first evaluate the definition 1 + 2 under the current environment; then, once we get the value 3 back, we evaluate the body x * x, but under an extended environment that adds the binding x -> 3.

To evaluate a recursive let, of course the big problem is that we seemingly need to know the value of the variable being defined while evaluating that value itself. For example, when evaluating let fac = \n. if (n == 0) {1} {n * fac (n-1)} in fac 5, what environment do we use for evaluating the definition \n. if (n == 0) {1} {n * fac (n-1)}? Since fac is in scope, we need to put some value for it into the environment but we don't have a value yet. But wait... do we really need to put something in the environment? Since the language is strict, any valid recursive definition in fact will not need to look at the value of the variable being defined. For example, in this case, fac is defined as a lambda expression, which is already a value without looking at its body; we don't need to look at fac while computing its value after all.

However, we do actually need to put something in the environment! Consider a (bogus) definition like let x = 3 * x in x + 1. This typechecks, but trying to evaluate it should either go into an infinite loop or throw some kind of "infinite loop detected" error. We cannot simply leave x out of the environment when evaluating its definition, since then the interpreter would crash with an "undefined variable" error.

One potential solution is to put some special "blackhole" value for the variable into the context. If this value is ever encountered, that means the value of the variable was needed to compute its own value, and we can throw an "infinite loop detected" error. So the idea is this: when evaluating let x = y in z, we put x -> Blackhole in the environment while evaluating y, and then, once we get some value v for x, we put x -> v in the environment while evaluating z.

This sounds great, but it's actually subtly wrong! The problem is that the value of something like a lambda is a closure that captures its current environment. This is absolutely critical to make lambdas work correctly. But it also means that, in this example, the value of fac is a closure which contains the definition fac -> blackhole! So calling fac 0 would work fine, since we would look up fac in the new environment that maps it to the closure. But fac 1 will fail with an "infinite loop detected" error, since we evaluate the body of the lambda under the environment captured at the time we made the closure; when it goes to call fac recursively it looks it up in this original environment, in which it finds fac -> blackhole.

The simplest solution is to have fac actually map to a mutable reference cell which starts out with the special "blackhole" value, but we can later update to the real value. This ensures that all references to the variable being defined, even ones captured by closures etc., will see the new, updated value.

All that is well and good, but I still think there may be an opportunity to simplify the code --- I am not sure whether we need the complication of incorporating delay/force. i.e. maybe we can get away with memory cells in the store having only two possible states: Blackhole and Value. Currently we have E Term Env which has to have force applied.

@byorgey
Copy link
Member Author

byorgey commented Jun 16, 2024

This old discussion is relevant: #150

@byorgey
Copy link
Member Author

byorgey commented Jun 17, 2024

After reading the discussion linked above, I can see why we might want the "memoized delay" for building lazy data structures, but I think it was a mistake to use this mechanism for evaluating recursive bindings. Currently, to evaluate a recursive let, we allocate a memory cell and store an unevaluated (delayed) expression in it. It does not get evaluated until the first time the binding is actually used. However, this does not really add anything as far as I can tell, and it has a definite cost: we have to wrap every use of such variables in force, which is tricky to get right, makes pretty-printing of elaborated terms ugly, and confuses anyone who prints out a value and sees force somewhere they did not write it.

I think we should be able to keep the memoized delay mechanism in place but stop using it for recursive bindings. I will give it a shot.

@byorgey
Copy link
Member Author

byorgey commented Jun 17, 2024

Actually, I now think we should just

  1. get rid of the "memoizing delay"
  2. have memory cells hold only (1) blackholes or (2) values

Once we add #1660, programmers can implement their own laziness, i.e. memoizing delay (if they want to e.g. make lazy infinite streams or whatever), as follows. Assume we have primitives

ref : a -> Ref a
write : Ref a -> a -> Cmd Unit
read : Ref a -> Cmd a

where Ref a is the type of references to memory cells holding a value of type a. Then we have

tydef Lazy a = Ref (a + {a}) end

def lazy : {a} -> Lazy a =
  \a. ref (inr a)
end

def readLazy : Lazy a -> Cmd a = \l.
  r <- read l;
  case r
    return
    (\d. let a = force d
         in  write l (inl a); return a
    )
end

I am not sure whether we can get away with leaving off the Cmd from some of these primitives, but in any case that's the basic idea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-Moderate Effort Should take a moderate amount of time to address. L-Elaboration Elaboration takes typechecked code and manipulates it (inserts extra code, etc.) to be ready to run. S-Nice to have The bug fix or feature would be nice but doesn't currently have much negative impact. Z-Refactoring This issue is about restructuring the code without changing the behaviour to improve code quality. Z-Research This issue requires inventing something new and/or studying research papers for clues.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant