Reevaluate use of `Store`/references for recursive variables #1948

byorgey · 2024-06-16T21:55:35Z

I forget why I did it this way, and I am not sure that it is really necessary. It certainly introduces extra complication (e.g. the need to wrap free variables in force during elaboration). We should either get rid of it, or add some clear comments explaining why it is necessary.

The text was updated successfully, but these errors were encountered:

byorgey · 2024-06-16T23:32:22Z

OK, it is necessary, and I remember/understand why now after trying the experiment of removing them. I will write up an explanation here, and leave this open until I have added more comments in the source code (I will wait until after merging #1928).

Evaluating a non-recursive let, e.g. let x = 1 + 2 in x * x, is easy: we first evaluate the definition 1 + 2 under the current environment; then, once we get the value 3 back, we evaluate the body x * x, but under an extended environment that adds the binding x -> 3.

To evaluate a recursive let, of course the big problem is that we seemingly need to know the value of the variable being defined while evaluating that value itself. For example, when evaluating let fac = \n. if (n == 0) {1} {n * fac (n-1)} in fac 5, what environment do we use for evaluating the definition \n. if (n == 0) {1} {n * fac (n-1)}? Since fac is in scope, we need to put some value for it into the environment but we don't have a value yet. But wait... do we really need to put something in the environment? Since the language is strict, any valid recursive definition in fact will not need to look at the value of the variable being defined. For example, in this case, fac is defined as a lambda expression, which is already a value without looking at its body; we don't need to look at fac while computing its value after all.

However, we do actually need to put something in the environment! Consider a (bogus) definition like let x = 3 * x in x + 1. This typechecks, but trying to evaluate it should either go into an infinite loop or throw some kind of "infinite loop detected" error. We cannot simply leave x out of the environment when evaluating its definition, since then the interpreter would crash with an "undefined variable" error.

One potential solution is to put some special "blackhole" value for the variable into the context. If this value is ever encountered, that means the value of the variable was needed to compute its own value, and we can throw an "infinite loop detected" error. So the idea is this: when evaluating let x = y in z, we put x -> Blackhole in the environment while evaluating y, and then, once we get some value v for x, we put x -> v in the environment while evaluating z.

This sounds great, but it's actually subtly wrong! The problem is that the value of something like a lambda is a closure that captures its current environment. This is absolutely critical to make lambdas work correctly. But it also means that, in this example, the value of fac is a closure which contains the definition fac -> blackhole! So calling fac 0 would work fine, since we would look up fac in the new environment that maps it to the closure. But fac 1 will fail with an "infinite loop detected" error, since we evaluate the body of the lambda under the environment captured at the time we made the closure; when it goes to call fac recursively it looks it up in this original environment, in which it finds fac -> blackhole.

The simplest solution is to have fac actually map to a mutable reference cell which starts out with the special "blackhole" value, but we can later update to the real value. This ensures that all references to the variable being defined, even ones captured by closures etc., will see the new, updated value.

All that is well and good, but I still think there may be an opportunity to simplify the code --- I am not sure whether we need the complication of incorporating delay/force. i.e. maybe we can get away with memory cells in the store having only two possible states: Blackhole and Value. Currently we have E Term Env which has to have force applied.

byorgey · 2024-06-16T23:34:30Z

This old discussion is relevant: #150

byorgey · 2024-06-17T01:56:36Z

After reading the discussion linked above, I can see why we might want the "memoized delay" for building lazy data structures, but I think it was a mistake to use this mechanism for evaluating recursive bindings. Currently, to evaluate a recursive let, we allocate a memory cell and store an unevaluated (delayed) expression in it. It does not get evaluated until the first time the binding is actually used. However, this does not really add anything as far as I can tell, and it has a definite cost: we have to wrap every use of such variables in force, which is tricky to get right, makes pretty-printing of elaborated terms ugly, and confuses anyone who prints out a value and sees force somewhere they did not write it.

I think we should be able to keep the memoized delay mechanism in place but stop using it for recursive bindings. I will give it a shot.

byorgey · 2024-06-17T03:46:15Z

Actually, I now think we should just

get rid of the "memoizing delay"
have memory cells hold only (1) blackholes or (2) values

Once we add #1660, programmers can implement their own laziness, i.e. memoizing delay (if they want to e.g. make lazy infinite streams or whatever), as follows. Assume we have primitives

ref : a -> Ref a
write : Ref a -> a -> Cmd Unit
read : Ref a -> Cmd a

where Ref a is the type of references to memory cells holding a value of type a. Then we have

tydef Lazy a = Ref (a + {a}) end

def lazy : {a} -> Lazy a =
  \a. ref (inr a)
end

def readLazy : Lazy a -> Cmd a = \l.
  r <- read l;
  case r
    return
    (\d. let a = force d
         in  write l (inl a); return a
    )
end

I am not sure whether we can get away with leaving off the Cmd from some of these primitives, but in any case that's the basic idea.

byorgey mentioned this issue Jun 17, 2024

Get rid of memoizing delay + switch to a strict store #1949

Merged

mergify bot closed this as completed in #1949 Jun 19, 2024

mergify bot closed this as completed in eb156a6 Jun 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reevaluate use of `Store`/references for recursive variables #1948

Reevaluate use of `Store`/references for recursive variables #1948

byorgey commented Jun 16, 2024

byorgey commented Jun 16, 2024

byorgey commented Jun 16, 2024

byorgey commented Jun 17, 2024

byorgey commented Jun 17, 2024

Reevaluate use of Store/references for recursive variables #1948

Reevaluate use of Store/references for recursive variables #1948

Comments

byorgey commented Jun 16, 2024

byorgey commented Jun 16, 2024

byorgey commented Jun 16, 2024

byorgey commented Jun 17, 2024

byorgey commented Jun 17, 2024

Reevaluate use of `Store`/references for recursive variables #1948

Reevaluate use of `Store`/references for recursive variables #1948