Here is the delta function: δ = (lambda (f) (f f)). Delta takes a function and tail calls that function on itself. What happens if we apply the delta function to itself? Since the delta function is the argument, it is tail called and applied to itself, which leads again to itself being tail called and applied to itself. We have a situation of infinite regression: the output of (δ δ) ends up being a restatement of the output of (δ δ). Now in this case, the regression is infinite and there is no base case, but imagine that somehow there were a base case, or that somehow we identified a value that the infinite regression equated to. Then each stage of the infinite regression just replicates the previous stage exactly. It is like having a perfectly silvered mirror: it replicates the image presented to it exactly. By calling delta on delta, we've arranged for our perfectly silvered mirror to reflect an image of itself. This leads to the “infinite hall of mirrors” effect.
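To make the regression concrete, here is a transcription of the delta function into Python (my sketch, not from the article). Python is also a call-by-value language, and it has no tail-call elimination, so the self-application surfaces as a stack overflow rather than a silent infinite loop:

```python
# The delta function: takes a function and calls it on itself.
delta = lambda f: f(f)

# Applying delta to itself regresses forever; each call to (delta delta)
# immediately becomes another call to (delta delta), so Python's stack
# eventually overflows.
try:
    delta(delta)
except RecursionError:
    print("(delta delta) never produces a value")
```
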
So let's tweak the delta function so that instead of perfectly replicating the infinite regression, it applies a function g around the replication: (lambda (f) (g (f f))). If we apply this modified delta function to itself, each expansion of the infinite regression ends up wrapping an application of g around it: (g (f f)) = (g (g (f f))) = (g (g (g (f f)))) = (g (g (g (g … )))). So our modified delta function gives us a nested infinite regression of applications of g. This is like our perfectly silvered mirror, but now the reflected image isn't mirrored exactly: we've put a frame on the mirror. When we arrange for the mirror to reflect itself, each nested reflection also has an image of the frame around the reflection, so we get a set of infinitely nested frames.
An infinite regression of (g (g (g (g … )))) is confusing. What does it mean? We can untangle this by unwrapping an application. (g (g (g (g … )))) is just a call to g. The argument to that call is weird, but we're just calling (g <something>). The result of the infinite regression (g (g (g (g … )))) is simply the result of the outermost call to g. We can use this to build a recursive function.
;; If factorial = (g (g (g (g … )))), then
;; factorial = (g factorial), where

(defun g (factorial)
  (lambda (x)
    (if (zerop x)
        1
        (* x (funcall factorial (- x 1))))))

The value returned by an inner invocation of g is the value that will be funcalled in the alternative branch of the conditional.
Y is defined thus:
Y = λg.(λf.g(f f))(λf.g(f f))
A straightforward implementation attempt would be

;; Non-working y operator
(defun y (g)
  (let ((d (lambda (f) (funcall g (funcall f f)))))
    (funcall d d)))

but since Lisp is a call-by-value language, it will attempt to (funcall f f) before funcalling g, and this will cause runaway recursion. We can avoid the runaway recursion by delaying the (funcall f f) with a strategically placed thunk:

;; Call-by-value y operator
;; returns (g (lambda () (g (lambda () (g (lambda () … ))))))
(defun y (g)
  (let ((d (lambda (f) (funcall g (lambda () (funcall f f))))))
    (funcall d d)))

Since the recursion is now wrapped in a thunk, we have to funcall the thunk to force the recursive call. Here is an example where we see that:
* (funcall (y (lambda (thunk)
                (lambda (x)
                  (if (zerop x)
                      1
                      (* x (funcall (funcall thunk) (- x 1)))))))
           6)
720

The (funcall thunk) invokes the thunk in order to get the actual recursive function, which we then funcall on (- x 1).
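The thunk trick is not specific to Lisp. Here is the same call-by-value y operator sketched in Python (my transcription; the names are mine):

```python
# Call-by-value Y operator: the self-application (f f) is delayed
# behind a zero-argument thunk so that g is called before the
# regression unwinds.
def y(g):
    d = lambda f: g(lambda: f(f))
    return d(d)

# As in the Lisp version, the recursive call must force the thunk
# (call it with no arguments) to obtain the actual recursive function.
factorial = y(lambda thunk:
              lambda x: 1 if x == 0 else x * thunk()(x - 1))

print(factorial(6))  # → 720
```
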
By wrapping the self-application with a thunk, we've made the call site where we use the thunk more complicated. We can clean that up by wrapping the call to the thunk in something nicer:
* (funcall (y (lambda (thunk)
                (flet ((factorial (&rest args)
                         (apply (funcall thunk) args)))
                  (lambda (x)
                    (if (zerop x)
                        1
                        (* x (factorial (- x 1))))))))
           6)
720

And we can even go so far as to hoist that wrapper back up into the definition of y:
(defun y1 (g)
  (let ((d (lambda (f)
             (funcall g (lambda (&rest args)
                          (apply (funcall f f) args))))))
    (funcall d d)))

* (funcall (y1 (lambda (factorial)
                 (lambda (x)
                   (if (zerop x)
                       1
                       (* x (funcall factorial (- x 1)))))))
           6)
720
y1 is an alternative formulation of the Y operator where we've η-expanded the recursive call to avoid the runaway recursion.
The η-expanded version of the applicative order Y operator has the advantage that it is convenient for defining recursive functions. The thunkified version is less convenient because you have to force the thunk before using it, but it allows you to use the Y operator to define recursive data structures as well as functions:
(y (lambda (delayed-ones)
     (cons-stream 1 (funcall delayed-ones))))
{1 …}
The argument to the thunkified Y operator is itself a procedure of one argument, the thunk. Y returns the result of calling its argument. Y should return a procedure, so the argument to Y should return a procedure. But it doesn't have to immediately return a procedure, it just has to eventually return a procedure, so we could, for example, print something before returning the procedure:
* (funcall (y (lambda (thunk)
                (format t "~%Returning a procedure")
                (lambda (x)
                  (if (zerop x)
                      1
                      (* x (funcall (funcall thunk) (- x 1)))))))
           6)
Returning a procedure
Returning a procedure
Returning a procedure
Returning a procedure
Returning a procedure
Returning a procedure
720

There is one caveat. You must be able to return the procedure without attempting to make the recursive call.
Let's transform the returned function before returning it by applying an arbitrary function h to it:

(y (lambda (thunk)
     (h (lambda (x)
          (if (zerop x)
              1
              …)))))

Ok, so now when we (funcall thunk) we don't get what we want, we've got an invocation of h around it. If we have an inverse to h, h⁻¹, available, we can undo it:

(y (lambda (thunk)
     (h (lambda (x)
          (if (zerop x)
              1
              (* x (funcall (h-1 (funcall thunk)) (- x 1))))))))

As a concrete example, we return a list and at the call site we extract the first element of that list before calling it:
* (funcall (car (y (lambda (thunk)
                     (list (lambda (x)
                             (if (zerop x)
                                 1
                                 (* x (funcall (car (funcall thunk)) (- x 1)))))))))
           6)
720

So we can return a list of mutually recursive functions:

(y (lambda (thunk)
     (list
      ;; even?
      (lambda (n)
        (or (zerop n)
            (funcall (cadr (funcall thunk)) (- n 1))))
      ;; odd?
      (lambda (n)
        (and (not (zerop n))
             (funcall (car (funcall thunk)) (- n 1)))))))

If we use the η-expanded version of the Y operator, then we can adapt it to expect a list of mutually recursive functions on the recursive call:
(defun y* (&rest g-list)
  (let ((d (lambda (f)
             (map 'list (lambda (g)
                          (lambda (&rest args)
                            (apply (apply g (funcall f f)) args)))
                  g-list))))
    (funcall d d)))

which we could use like this:
* (let ((eo (y* (lambda (even? odd?)
                  (declare (ignore even?))
                  (lambda (n)
                    (or (zerop n)
                        (funcall odd? (- n 1)))))
                (lambda (even? odd?)
                  (declare (ignore odd?))
                  (lambda (n)
                    (and (not (zerop n))
                         (funcall even? (- n 1))))))))
    (let ((even? (car eo))
          (odd? (cadr eo)))
      (do ((i 0 (+ i 1)))
          ((>= i 5))
        (format t "~%~d, ~s ~s" i (funcall even? i) (funcall odd? i)))))
0, T NIL
1, NIL T
2, T NIL
3, NIL T
4, T NIL

Instead of returning a list of mutually recursive functions, we could return them as multiple values. We just have to be expecting multiple values at the call site:
(defun y* (&rest gs)
  (let ((d (lambda (f)
             (apply #'values
                    (map 'list (lambda (g)
                                 (lambda (&rest args)
                                   (apply (multiple-value-call g (funcall f f)) args)))
                         gs)))))
    (funcall d d)))
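The mutual-recursion operator also has a compact Python transcription (my sketch; Python has no multiple values, so it returns a tuple, analogous to the list-returning version of y*). Each g receives all of the recursive stubs as arguments and returns one member of the mutually recursive family:

```python
# y_star: a Python sketch of the η-expanded Y operator generalized to a
# family of mutually recursive functions.  f(f) rebuilds the whole
# tuple of stubs; each stub passes them all to its g.
def y_star(*gs):
    def d(f):
        # The (lambda g: ...)(g) wrapper captures g by value, avoiding
        # Python's late binding of loop variables in closures.
        return tuple(
            (lambda g: lambda *args: g(*f(f))(*args))(g)
            for g in gs)
    return d(d)

is_even, is_odd = y_star(
    lambda even, odd: lambda n: n == 0 or odd(n - 1),
    lambda even, odd: lambda n: n != 0 and even(n - 1))

print(is_even(4), is_odd(4))  # → True False
```
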
MIT Scheme used to have a construct called a named lambda. A named lambda has an extra first argument that is automatically filled in with the function itself. So during evaluation of the body of a named lambda, the name is bound to the named lambda, enabling the function to call itself recursively:
(defmacro named-lambda ((name &rest args) &body body)
  `(y1 (lambda (,name)
         (lambda ,args ,@body))))

* (funcall (named-lambda (factorial x)
             (if (zerop x)
                 1
                 (* x (funcall factorial (- x 1)))))
           6)
720

This leads us to named let expressions. In a named let, the implicit lambda that performs the let bindings is a named lambda. Using that name to invoke the lambda on a different set of arguments is like recursively re-doing the let.
* (named-let fact ((x 6))
    (if (zerop x)
        1
        (* x (funcall fact (- x 1)))))
720
In Scheme, you use letrec to define recursive or mutually recursive procedures. Internal definitions expand into an appropriate letrec. letrec achieves the necessary circularity not through the Y operator, but through side effects. It is hard to tell the difference, but there is a difference. Using the Y operator would allow you to have recursion but avoid the implicit side effects of a letrec.
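The side-effect approach that letrec takes can be sketched in Python (my illustration, not from the article): establish the binding first, then assign the recursive function into it, so the function's free reference to its own name is tied by mutation rather than by a fixed point.

```python
# letrec-style knot tying via side effect: the name exists before the
# function is created, and the function's body looks the name up at
# call time, by which point it has been mutated to refer to the
# function itself.
fact = None
fact = lambda x: 1 if x == 0 else x * fact(x - 1)

print(fact(6))  # → 720
```
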
Oleg Kiselyov has more to say about the Y operator at http://okmij.org/ftp/Computation/fixed-point-combinators.html