Abstract Heresies: April 2021

Monday, April 19, 2021

η-conversion and tail recursion

Consider this lambda expression: (lambda (x) (sqrt x)). This function simply calls sqrt on its argument and returns whatever sqrt returns. There is no argument you could provide to this function that would cause it to return a different result than you would get from calling sqrt directly. We say that this function and the sqrt function are extensionally equal. We can replace this lambda expression with a literal reference to the sqrt function without changing the value produced by our code.

You can go the other way, too. If you find a literal reference to a function, you can replace it with a lambda expression that calls the function. This is η-conversion. η-reduction is removing an unnecessary lambda wrapper, η-expansion is introducting one.

η-conversion comes with caveats. First, it only works on functions. If I have a string "foo", and I attempt to η-expand this into (lambda (x) ("foo" x)), I get nonsense. Second, a reduction strategy that incorporates η-reduction can be weaker than one that does not. Consider this expression: (lambda (x) ((compute-f) x)). We can η-reduce this to (compute-f), but this makes a subtle difference. When wrapped with the lambda, (compute-f) is evaluated just before it is applied to x. In fact, we won't call (compute-f) unless we invoke the result of the lambda expression somewhere. However, once η-reduced, (compute-f) is evaluated at the point the original lambda was evaluated, which can be quite a bit earlier.

When a function foo calls another function bar as a subproblem, an implicit continuation is passed to bar. bar invokes this continuation on the return value that it computes. We can characterize this continuation like this:

k_bar = (lambda (return-value)
         (k_foo (finish-foo return-value)))

this just says that when bar returns, we'll finish running the code in foo and further continue by invoking the continuation supplied to foo.

If foo makes a tail call to bar, then foo is just returning what bar computes. There is no computation for foo to finish, so the continuation is just

k_bar = (lambda (return-value)
         (k_foo return-value))

But this η-reduces to just k_foo, so we don't have to allocate a new trivial continuation when foo tail calls bar, we can just pass along the continuation that was passed to foo.

Tail recursion is equivalent to η-reducing the implicit continuations to functions where possible. A Scheme aficionado might prefer to say avoiding η-expanding where unnecessary.

This is a mathematical curiosity, but does it have practical significance? If you're programming in continuation passing style, you should be careful to η-reduce (or avoid η-expanding) your code.

Years ago I was writing an interpreter for the REBOL language. I was getting frustrated trying to make it tail recursive. I kept finding places in the interpreter where the REBOL source code was making a tail call, but the interpreter itself wasn't, so the stack would grow without bound. I decided to investigate the problem by rewriting the interpreter in continuation passing style and seeing why I couldn't η-convert the tail calls. Once in CPS, I could see that eval took two continuations and I could achieve tail recursion by η-reducing one of them.

Saturday, April 10, 2021

Can continuation passing style code perform well?

Continuation passing style is a powerful technique that allows you to abstract over control flow in your program. Here is a simple example: We want to look things up in a table, but sometimes the key we use is not associated with any value. In that case, we have to do something different, but the lookup code doesn't know what the caller wants to do, and the caller doesn't know how the lookup code works. Typically, we would arrange for the lookup code to return a special “key not found” value:

(let ((answer (lookup key table)))
   (if (eq answer 'key-not-found)
       ... handle missing key ...    
       ... compute something with answer...)

There are two minor problems with this approach. First, the “key not found” value has to be within the type returned by lookup. Consider a table that can only contain integers. Unfortunately, we cannot declare answer to be an integer because it might be the “key not found” value. Alternatively, we might decide to reserve a special integer to indicate “key not found”. The answer can then be declared an integer, but there is now a magic integer that cannot be stored in the table. Either way, answer is a supertype of what can be stored in the table, and we have to project it back down by testing it against “key not found”.

The second problem is one of redundancy. Presumably, somewhere in the code for lookup there is a conditional for the case that the key hasn't been found. We take a branch and return the “key not found” value. But now the caller tests the return value against “key not found” and it, too, takes a branch. We only take the true branch in the caller if the true branch was taken in the callee and we only take the false branch in the caller if the false branch was taken in the callee. In essence, we are branching on the exact same condition twice. We've reified the control flow, injected the reified value into the space of possible return values, passed it through the function call boundary, then projected and reflected the value back into control flow at the call site.

If we write this in continuation passing style, the call looks like this

(lookup key table
   (lambda (answer)
     …compute something with answer…)
   (lambda ()
     …handle missing key…))

lookup will invoke the first lambda expression on the answer if it is found, but it will invoke the second lambda expression if the answer is not found. We no longer have a special “key not found” value, so answer can be exactly the type of what is stored in the table and we don't have to reserve a magic value. There is also no redundant conditional test in the caller.

This is pretty cool, but there is a cost. The first is that it takes practice to read continuation passing style code. I suppose it takes practice to read any code, but some languages make it extra cumbersome to pass around the lambda expressions. (Some seem actively hostile to the idea.) It's just more obscure to be passing around continuations when direct style will do.

The second cost is one of performance and efficiency. The lambda expressions that you pass in to a continuation passing style program will have to be closed in the caller's environment, and this likely means storage allocation. When the callee invokes one of the continuations, it has to perform a function call. Finally, the lexically scoped variables in the continuation will have to be fetched from the closure's environment. Direct style performs better because it avoids all the lexical closure machinery and can keep variables in the local stack frame. For these reasons, you might have reservations about writing code in continuation passing style if it needs to perform.

Continuation passing style looks complicated, but you don't need a Sufficiently Smart^™ compiler to generate efficient code from it. Here is lookup coded up to illustrate:

(defun lookup (key table if-found if-not-found)
   (labels ((scan-entries (entries)
              (cond ((null entries) (funcall if-not-found))
                    ((eq (caar entries) key) (funcall if-found (cdar entries)))
                    (t (scan-entries (cdr entries))))))
     (scan-entries table)))

and a sample use might be

(defun probe (thing)
  (lookup thing *special-table*
    (lambda (value) (format t "~s maps to ~s." thing value))
    (lambda () (format t "~s is not special." thing))))

Normally, probe would have to allocate two closures to pass in to lookup, and the code in each closure would have to fetch the lexical value of key from the closure. But without changing either lookup or probe we can (declaim (inline lookup)). Obviously, inlining the call will eliminate the overhead of a function call, but watch what happens to the closures:

(defun probe (thing)
  ((lambda (key table if-found if-not-found)
     (labels ((scan-entries (table)
                (cond ((null entries) (funcall if-not-found))
                      ((eq (caar entries) key) (funcall if-found (cdar entries)))
                      (t (scan-entries (cdr entries))))))
        (scan-entries table)))
    thing *special-table*
    (lambda (value) (format t "~s maps to ~s." thing value))
    (lambda () (format t "~s has no mapping." thing))))

A Decent Compiler^™ will easily notice that key is just an alias for thing and that table is just an alias for *special-table*, so we get:

(defun probe (thing)
  ((lambda (if-found if-not-found)
     (labels ((scan-entries (entries)
                (cond ((null entries) (funcall if-not-found))
                      ((eq (caar entries) thing) (funcall if-found (cdar entries)))
                      (t (scan-entries (cdr entries))))))
        (scan-entries *special-table*)))
    (lambda (value) (format t "~s maps to ~s." thing value))
    (lambda () (format t "~s has no mapping." thing))))

and the expressions for if-found and if-not-found are side-effect free, so they can be inlined (and we expect the compiler to correctly avoid unexpected variable capture):

(defun probe (thing)
  ((lambda ()
     (labels ((scan-entries (entries)
                (cond ((null entries)
                       (funcall (lambda () (format t "~s has no mapping." thing))))
                      ((eq (caar entries) thing)
                       (funcall (lambda (value) (format t "~s maps to ~s." thing value))
                                (cdar entries)))
                      (t (scan-entries (cdr entries))))))
        (scan-entries *special-table*)))))

and the immediate calls to literal lambdas can be removed:

(defun probe (thing)
   (labels ((scan-entries (entries)
              (cond ((null entries) (format t "~s has no mapping." thing))
                    ((eq (caar entries) thing)
                     (format t "~s maps to ~s." thing (cdar value))))
                    (t (scan-entries (cdr entries))))))
      (scan-entries *special-table*)))

Our Decent Compiler^™ has removed all the lexical closure machinery and turned the calls to the continuations into direct code. This code has all the features we desire: there is no special “key not found” value to screw up our types, there is no redundant branch: the (null entries) test directly branches into the appropriate handling code, we do not allocate closures, and the variables that would have been closed over are now directly apparent in the frame.

It's a bit vacuous to observe that an inlined function performs better. Of course it does. At the very least you avoid a procedure call. But if you inline a continuation passing style function, any Decent Compiler^™ will go to town and optimize away the continuation overhead. It's an unexpected bonus.

On occasion I find that continuation passing style is just the abstraction for certain code that is also performance critical. I don't give it a second thought. Continuation passing style can result in high-performance code if you simply inline the critical calls.

Saturday, April 3, 2021

Early LISP Part II (Apply redux)

By April of 1959, issues with using subst to implement β-reduction became apparent. In the April 1959 Quarterly Progress Report of the Research Laboratory of Electronics, McCarthy gives an updated definition of the universal S-function apply:

    apply[f;args]=eval[cons[f;appq[args]];NIL]
where
    appq[m]=[null[m]→NIL;T→cons[list[QUOTE;car[m]];appq[cdr[m]]]]
and
       eval[e;a]=[
atom[e]→eval[assoc[e;a];a];
atom[car[e]]→[
car[e]=QUOTE→cadr[e];
car[e]=ATOM→atom[eval[cadr[e];a]];
car[e]=EQ→[eval[cadr[e];a]=eval[caddr[e];a]];
car[e]=COND→evcon[cdr[e];a];
car[e]=CAR→car[eval[cadr[e];a]];
car[e]=CDR→cdr[eval[cadr[e];a]];
car[e]=CONS→cons[eval[cadr[e];a];eval[caddr[e];a]];
T→eval[cons[assoc[car[e];a];evlis[cdr[e];a]];a]];
caar[e]=LABEL→eval[cons[caddar[e];cdr[e]];cons[list[cadar[e];car[e]];a]];
caar[e]=LAMBDA→eval[caddar[e];append[pair[cadar[e];cdr[e]];a]]

and
    evcon[c;a]=[eval[caar[c];a]→eval[cadar[c];a];T→evcon[cdr[c];a]]
and
    evlis[m;a]= [null[m]→NIL;T→cons[list[QUOTE;eval[car[m];a]];
evlis[cdr[m];a]]

I find this a lot easier to understand if we transcribe it into modern Common LISP:

;;; Hey Emacs, this is -*- Lisp -*-

(in-package "CL-USER")

;; Avoid smashing the standard definitions.
(shadow "APPLY")
(shadow "ASSOC")
(shadow "EVAL")

(defun apply (f args)
  (eval (cons f (appq args)) nil))

(defun appq (m)
  (cond ((null m) nil)
        (t (cons (list 'QUOTE (car m)) (appq (cdr m))))))

(defun eval (e a)
  (cond ((atom e) (eval (assoc e a) a))
        ((atom (car e))
         (cond ((eq (car e) 'QUOTE) (cadr e))
               ((eq (car e) 'ATOM)  (atom (eval (cadr e) a)))
               ((eq (car e) 'EQ)    (eq (eval (cadr e) a) (eval (caddr e) a)))
               ((eq (car e) 'COND)  (evcon (cdr e) a))
               ((eq (car e) 'CAR)   (car (eval (cadr e) a)))
               ((eq (car e) 'CDR)   (cdr (eval (cadr e) a)))
               ((eq (car e) 'CONS)  (cons (eval (cadr e) a) (eval (caddr e) a)))
               (t (eval (cons (assoc (car e) a) (evlis (cdr e) a)) a))))
        ((eq (caar e) 'LABEL) (eval (cons (caddar e) (cdr e))
                                    (cons (list (cadar e) (car e)) a)))
        ((eq (caar e) 'LAMBDA) (eval (caddar e)
                                     (append (pair (cadar e) (cdr e)) a)))))

(defun evcon (c a)
  (cond ((eval (caar c) a) (eval (cadar c) a))
        (t (evcon (cdr c) a))))

(defun evlis (m a)
  (cond ((null m) nil)
        (t (cons (list 'QUOTE (eval (car m) a)) (evlis (cdr m) a)))))

;;; Modern helpers
(defun assoc (k l)
  (cadr (cl:assoc k l)))

(defun pair (ls rs)
  (map 'list #'list ls rs))

(defun testit ()
  (apply '(label ff (lambda (x) (cond ((atom x) x) ((quote t) (ff (car x))))))
         (list '((a . b) . c))))

There are a few things to notice about this. First, there is no code that inspects the value cell or function cell of a symbol. All symbols are evaluated by looking up the value in the association list a, so this evaluator uses one namespace. Second, the recursive calls to eval when evaluating combinations (the last clause of the inner cond and the LABEL and LAMBDA clauses) are in tail position, so this evaluator could be coded up tail-recursively. (It is impossible to say without inspecting the IBM 704 assembly code.)

What is most curious about this evaluator is the first clause in the outer cond in eval. This is where variable lookup happens. As you can see, we look up the variable by calling assoc, but once we obtain the value, we call eval on it. This LISP isn't storing values in the environment, but rather expressions that evaluate to values. If we look at the LAMBDA clause of the cond, the one that handles combinations that begin with lambda expressions, we can see that it does not evaluate the arguments to the lambda but instead associates the bound variables with the arguments' expressions. This therefore has call-by-name semantics rather than the modern call-by-value semantics.

By April 1960 we see these changes:

(defun eval (e a)
  (cond ((atom e) (assoc e a))
        ((atom (car e))
         (cond ((eq (car e) 'QUOTE) (cadr e))
               ((eq (car e) 'ATOM)  (atom (eval (cadr e) a)))
               ((eq (car e) 'EQ)    (eq (eval (cadr e) a) (eval (caddr e) a)))
               ((eq (car e) 'COND)  (evcon (cdr e) a))
               ((eq (car e) 'CAR)   (car (eval (cadr e) a)))
               ((eq (car e) 'CDR)   (cdr (eval (cadr e) a)))
               ((eq (car e) 'CONS)  (cons (eval (cadr e) a) (eval (caddr e) a)))
               (t (eval (cons (assoc (car e) a) (evlis (cdr e) a)) a))))
        ((eq (caar e) 'LABEL) (eval (cons (caddar e) (cdr e))
                                    (cons (list (cadar e) (car e)) a)))
        ((eq (caar e) 'LAMBDA) (eval (caddar e)
                                     (append (pair (cadar e) (evlis (cdr e) a)) a)))))

Note how evaluating an atom now simply looks up the value of the atom in the association list and evaluation of a combination of a lambda involves evaluating the arguments eagerly. This is a call-by-value interpreter.