Friday, May 21, 2021

Stupid Y operator tricks

Here is the delta function: δ = (lambda (f) (f f)). Delta takes a function and tail calls that function on itself. What happens if we apply the delta function to itself? Since the delta function is the argument, it is tail called and applied to itself. Which leads again to itself being tail called and applied to itself. We have a situation of infinite regression: the output of (δ δ) ends up being a restatement of the output of (δ δ). Now in this case, regression is infinite and there is no base case, but imagine that somehow there were a base case, or that somehow we identified a value that an infinite regression equated to. Then each stage of the infinite regression just replicates the previous stage exactly. It is like having a perfectly silvered mirror: it just replicates the image presented to it exactly. By calling delta on delta, we've arranged our perfectly silvered mirror to reflect an image of itself. This leads to the “infinite hall of mirrors” effect.

So let's tweak the delta function so that instead of perfectly replicating the infinite regression, it applies a function g around the replication: (lambda (f) (g (f f))). If we apply this modified delta function to itself, each expansion of the infinite regression ends up wrapping an application of the g around it: (g (f f)) = (g (g (f f))) = (g (g (g (f f)))) = (g (g (g (g … )))). So our modified delta function gives us a nested infinite regression of applications of g. This is like our perfectly silvered mirror, but now the reflected image isn't mirrored exactly: we've put a frame on the mirror. When we arrange for the mirror to reflect itself, each nested reflection also has an image of the frame around the reflection, so we get a set of infinitely nested frames.

An infinite regression of (g (g (g (g … )))) is confusing. What does it mean? We can untangle this by unwrapping an application. (g (g (g (g … )))) is just a call to g. The argument to that call is weird, but we're just calling (g <something>). The result of the infinite regression (g (g (g (g … )))) is simply the result of the outermost call to g. We can use this to build a recursive function.

;; If factorial = (g (g (g (g … )))), then
;; factorial = (g factorial), where

(defun g (factorial)
  (lambda (x)
    (if (zerop x)
        1
        (* x (funcall factorial (- x 1))))))
The value returned by an inner invocation of g is the value that will be funcalled in the altenative branch of the conditional.

Y is defined thus:

Y = λg.(λf.g(f f))(λf.g(f f))

A straightforward implementation attempt would be
;; Non working y operator
(defun y (g)
  (let ((d (lambda (f) (funcall g (funcall f f)))))
    (funcall d d)))
but since lisp is a call-by-value language, it will attempt to (funcall f f) before funcalling g, and this will cause runaway recursion. We can avoid the runaway recursion by delaying the (funcall f f) with a strategically placed thunk
;; Call-by-value y operator
;; returns (g (lambda () (g (lambda () (g (lambda () … ))))))
(defun y (g)
  (let ((d (lambda (f) (funcall g (lambda () (funcall f f))))))
    (funcall d d)))
Since the recursion is now wrapped in a thunk, we have to funcall the thunk to force the recursive call. Here is an example where we see that:
* (funcall (Y (lambda (thunk)
                (lambda (x)
                  (if (zerop x)
                      1
                      (* x (funcall (funcall thunk) (- x 1)))))))
           6)
720
the (funcall thunk) invokes the thunk in order to get the actual recursive function, which we when then funcall on (- x 1).

By wrapping the self-application with a thunk, we've made the call site where we use the thunk more complicated. We can clean that up by wrapping the call to the thunk in something nicer:

* (funcall
    (y (lambda (thunk)
         (flet ((factorial (&rest args)
                  (apply (funcall thunk) args)))

           (lambda (x)
             (if (zerop x)
                 1
                 (* x (factorial (- x 1))))))))
      6)
720
And we can even go so far as to hoist that wrapper back up into the definiton of y
(defun y1 (g)
  (let ((d (lambda (f) (funcall g (lambda (&rest args) (apply (funcall f f) args))))))
    (funcall d d)))

* (funcall
    (y1 (lambda (factorial)
          (lambda (x)           
            (if (zerop x)
                1
                (* x (funcall factorial x))))))
    6)
720
y1 is an alternative formulation of the Y operator where we've η-expanded the recursive call to avoid the runaway recursion.

The η-expanded version of the applicative order Y operator has the advantage that it is convenient for defining recursive functions. The thunkified version is less convenient because you have to force the thunk before using it, but it allows you to use the Y operator to define recursive data structures as well as functions:

(Y
  (lambda (delayed-ones)
    (cons-stream 1 (delayed-ones))))
{1 …}

The argument to the thunkified Y operator is itself a procedure of one argument, the thunk. Y returns the result of calling its argument. Y should return a procedure, so the argument to Y should return a procedure. But it doesn't have to immediately return a procedure, it just has to eventually return a procedure, so we could, for example, print something before returning the procedure:

* (funcall (Y (lambda (thunk)
                (format t "~%Returning a procedure")
                (lambda (x)
                  (if (zerop x)
                      1
                      (* x (funcall (funcall thunk) (- x 1)))))))
         6)
Returning a procedure
Returning a procedure
Returning a procedure
Returning a procedure
Returning a procedure
Returning a procedure
720
There is one caveat. You must be able to return the procedure without attempting to make the recursive call.

Let's transform the returned function before returning it by applying an arbitrary function h to it:

(Y (lambda (thunk)
     (h (lambda (x)
          (if (zerop x)
              1
              … )))))
Ok, so now when we (funcall thunk) we don't get what we want, we've got an invocation of h around it. If we have an inverse to h, h-1, available, we can undo it:
(y (lambda (thunk)
      (h (lambda (x)
           (if (zerop x)
               1
               (* (funcall (h-1 (funcall thunk)) (- x 1))))))))
As a concrete example, we return a list and at the call site we extract the first element of that list before calling it:
* (funcall (car (y (lambda (thunk)
                     (list (lambda (x)
                             (if (zerop x)
                                 1
                                 (* x (funcall (car (funcall thunk)) (- x 1))))))))
           6)
720
So we can return a list of mutually recursive functions:
(y (lambda (thunk)
    (list
      ;; even?
      (lambda (n)
        (or (zerop n)
            (funcall (cadr (funcall thunk)) (- n 1))))

      ;; odd?
      (lambda (n)
        (and (not (zerop n))
             (funcall (car (funcall thunk)) (- n 1))))
      )))
If we use the η-expanded version of the Y operator, then we can adapt it to expect a list of mutually recursive functions on the recursive call:
(defun y* (&rest g-list)
  (let ((d (lambda (f)
             (map 'list (lambda (g)
                          (lambda (&rest args)
                            (apply (apply g (funcall f f)) args)))
                  g-list))))
     (funcall d d)))
which we could use like this:
* (let ((eo (y* (lambda (even? odd?)
                  (declare (ignore even?))
                  (lambda (n)
                    (or (zerop n)
                        (funcall odd? (- n 1)))))

                (lambda (even? odd?)
                  (declare (ignore odd?))
                  (lambda (n)
                    (and (not (zerop n))
                         (funcall even? (- n 1))))))))
     (let ((even? (car eo))
           (odd?  (cadr eo)))
       (do ((i 0 (+ i 1)))
           ((>= i 5))
         (format t "~%~d, ~s ~s"
                 i
                 (funcall even? i)
                 (funcall odd? i)))))))
                 
0, T NIL
1, NIL T
2, T NIL
3, NIL T
4, T NIL
Instead of returning a list of mutually recursive functions, we could return them as multiple values. We just have to be expecting multiple values at the call site:
(defun y* (&rest gs)
  (let ((d (lambda (f)
             (apply #'values
                    (map 'list
                         (lambda (g)
                           (lambda (&rest args)
                             (apply (multiple-value-call g (funcall f f)) args)))
                         gs)))))
    (funcall d d)))

MIT Scheme used to have a construct called a named lambda. A named lambda has an extra first argument that is automatically filled in with the function itself. So during evaluation of the body of a named lambda, the name is bound to the named lambda, enabling the function to call itself recursively:

(defmacro named-lambda ((name &rest args) &body body)
  `(y1 (lambda (,name)
         (lambda ,args
           ,@body))))

* (funcall (named-lambda (factorial x)
            (if (zerop x)
                1
                (* x (funcall factorial (- x 1)))))
         6)
720
This leads us to named let expressions. In a named let, the implicit lambda that performs the let bindings is a named lambda. Using that name to invoke the lambda on a different set of arguments is like recursively re-doing the let.
* (named-let fact ((x 6)) (if (zerop x) 1 (* x (funcall fact (- x 1)))))

720

In Scheme, you use letrec to define recursive or mutually recursive procedures. Internal definitions expand into an appropriate letrec. letrec achieves the necessary circularity not through the Y operator, but through side effects. It is hard to tell the difference, but there is a difference. Using the Y operator would allow you to have recursion, but avoid the implicit side effects in a letrec.

Oleg Kiselyov has more to say about the Y operator at http://okmij.org/ftp/Computation/fixed-point-combinators.html

1 comment:

Michael South said...

In the "y1" definition, should "(funcall factorial x)" actually be "(funcall factorial (- x 1))"?