Here is the delta function: `δ = (lambda (f) (f f))`. Delta
takes a function and tail calls that function on itself. What
happens if we apply the delta function to itself? Since the delta
function is the argument, it is tail called and applied to itself,
which leads again to itself being tail called and applied to
itself. We have a situation of infinite regression: the output
of `(δ δ)` ends up being a restatement of the output of `(δ δ)`.
Now in this case, the regression
is infinite and there is no base case, but imagine that somehow
there were a base case, or that somehow we identified a value that
an infinite regression equated to. Then each stage of the infinite
regression just replicates the previous stage exactly. It is like
having a perfectly silvered mirror: it just replicates the image
presented to it exactly. By calling delta on delta, we've arranged our perfectly silvered
mirror to reflect an image of itself. This leads to the “infinite
hall of mirrors” effect.

So let's tweak the delta function so that instead of perfectly
replicating the infinite regression, it applies a
function `g` around the replication: `(lambda (f) (g (f f)))`.
If we apply this modified delta function to itself,
each expansion of the infinite regression ends up wrapping an
application of `g` around it:
`(g (f f)) = (g (g (f f))) = (g (g (g (f f)))) = (g (g (g (g … ))))`.
So our modified delta function gives us a nested infinite regression of
applications of `g`. This is like our perfectly silvered
mirror, but now the reflected image isn't mirrored exactly: we've
put a frame on the mirror. When we arrange for the mirror to
reflect itself, each nested reflection also has an image of the
frame around the reflection, so we get a set of infinitely nested
frames.

An infinite regression of `(g (g (g (g … ))))` is
confusing. What does it mean? We can untangle this by unwrapping
an application. `(g (g (g (g … ))))` is just a
call to `g`. The argument to that call is weird, but
we're just calling `(g `*<something>*`)`. The
result of the infinite regression `(g (g (g (g … ))))`
is simply the result of the outermost call
to `g`. We can use this to build a recursive function.

    ;; If factorial = (g (g (g (g … )))), then
    ;; factorial = (g factorial), where
    (defun g (factorial)
      (lambda (x)
        (if (zerop x)
            1
            (* x (funcall factorial (- x 1))))))

The value returned by an inner invocation of `g` is the value that
will be funcalled in the alternative branch of the conditional.
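To make the nesting concrete, here is a minimal sketch that unrolls `g` a finite number of times. `bottom` and `*fact-3*` are hypothetical names introduced only for this illustration; `bottom` stands in for the part of the regress we never reach and signals an error if the unrolling runs out.

```lisp
;; G as defined in the text: given some approximation of factorial,
;; return a function that handles one more level of recursion.
(defun g (factorial)
  (lambda (x)
    (if (zerop x)
        1
        (* x (funcall factorial (- x 1))))))

;; BOTTOM (hypothetical) plays the role of the innermost "…".
(defun bottom (x)
  (error "Ran out of unrolled applications of G at ~d" x))

;; Three nested applications of G are enough for arguments 0, 1 and 2.
(defparameter *fact-3* (g (g (g #'bottom))))

(funcall *fact-3* 2)   ; => 2
;; (funcall *fact-3* 3) would reach BOTTOM and signal an error.
```

Each extra application of `g` extends the range of arguments by one, which is why the infinite nesting behaves like the full factorial function.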

Y is defined thus:

Y = λg.(λf.g(f f))(λf.g(f f))

A straightforward implementation attempt would be

    ;; Non working y operator
    (defun y (g)
      (let ((d (lambda (f) (funcall g (funcall f f)))))
        (funcall d d)))

but since Lisp is a call-by-value language, it will attempt to
`(funcall f f)` before funcalling `g`, and this will cause runaway
recursion. We can avoid the runaway recursion by delaying
the `(funcall f f)` with a strategically placed thunk:

    ;; Call-by-value y operator
    ;; returns (g (lambda () (g (lambda () (g (lambda () … ))))))
    (defun y (g)
      (let ((d (lambda (f) (funcall g (lambda () (funcall f f))))))
        (funcall d d)))

Since the recursion is
now wrapped in a thunk, we have to funcall the thunk to force the
recursive call. Here is an example where we see that:

    * (funcall (Y (lambda (thunk)
                    (lambda (x)
                      (if (zerop x)
                          1
                          (* x (funcall (funcall thunk) (- x 1)))))))
               6)
    720

The `(funcall thunk)` invokes the thunk in order to get the
actual recursive function, which we then `funcall` on `(- x 1)`.

By wrapping the self-application with a thunk, we've made the call
site where we use the thunk more complicated. We can clean that up
by wrapping the call to the thunk in something nicer:

    * (funcall
       (y (lambda (thunk)
            (flet ((factorial (&rest args)
                     (apply (funcall thunk) args)))
              (lambda (x)
                (if (zerop x)
                    1
                    (* x (factorial (- x 1))))))))
       6)
    720

And we can even go so far as to hoist that wrapper back up
into the definition of y:

    (defun y1 (g)
      (let ((d (lambda (f)
                 (funcall g (lambda (&rest args)
                              (apply (funcall f f) args))))))
        (funcall d d)))

    * (funcall
       (y1 (lambda (factorial)
             (lambda (x)
               (if (zerop x)
                   1
                   (* x (funcall factorial (- x 1)))))))
       6)
    720

`y1` is an alternative formulation of the Y
operator where we've η-expanded the recursive call to avoid the
runaway recursion.
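As a further illustration, here is a sketch of the η-expanded operator applied to a function that makes two recursive calls. The Fibonacci example and the name `*fib*` are not from the original text; they are assumptions chosen only to show that the recursive argument can be funcalled like an ordinary function, as many times as needed.

```lisp
;; The η-expanded Y operator, repeated so the example is self-contained.
(defun y1 (g)
  (let ((d (lambda (f)
             (funcall g (lambda (&rest args)
                          (apply (funcall f f) args))))))
    (funcall d d)))

;; *FIB* (hypothetical name) is built without DEFUN, LABELS, or assignment.
(defparameter *fib*
  (y1 (lambda (fib)
        (lambda (n)
          (if (< n 2)
              n
              (+ (funcall fib (- n 1))
                 (funcall fib (- n 2))))))))

(funcall *fib* 10)   ; => 55
```

Note that every recursive call re-runs `(funcall f f)` and so rebuilds the closure, so this is a demonstration of the technique rather than an efficient way to compute Fibonacci numbers.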

The η-expanded version of the applicative order Y operator has
the advantage that it is convenient for defining recursive functions. The
thunkified version is less convenient because you have to force the
thunk before using it, but it allows you to use the Y operator to
define recursive data structures as well as functions:

    (Y
     (lambda (delayed-ones)
       (cons-stream 1 (delayed-ones))))
    {1 …}
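The example above is written with Scheme's `cons-stream`, which delays its second argument. A rough Common Lisp analogue, assuming we represent a stream cell as a cons whose cdr is a thunk, might look like the sketch below. `y-thunk`, `*ones*`, and `stream-take` are hypothetical names introduced for this illustration.

```lisp
;; The thunkified Y operator from the text, under a hypothetical name.
(defun y-thunk (g)
  (let ((d (lambda (f)
             (funcall g (lambda () (funcall f f))))))
    (funcall d d)))

;; A stream cell is (value . thunk); forcing the thunk yields the next cell.
;; We store the thunk directly instead of calling it, so nothing runs away.
(defparameter *ones*
  (y-thunk (lambda (delayed-ones)
             (cons 1 delayed-ones))))

;; STREAM-TAKE (hypothetical helper) collects the first N elements,
;; forcing one thunk per element.
(defun stream-take (n stream)
  (loop repeat n
        collect (car stream)
        do (setf stream (funcall (cdr stream)))))

(stream-take 3 *ones*)   ; => (1 1 1)
```

Unlike a memoizing stream, this sketch builds a fresh cell every time a thunk is forced, so the structure is infinite but not physically circular.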

The argument to the thunkified Y operator is itself a procedure of
one argument, the thunk. Y returns the result of calling its
argument. Y should return a procedure, so the argument to Y should
return a procedure. But it doesn't have to *immediately*
return a procedure, it just has to *eventually* return a
procedure, so we could, for example, print something before returning
the procedure:

    * (funcall (Y (lambda (thunk)
                    (format t "~%Returning a procedure")
                    (lambda (x)
                      (if (zerop x)
                          1
                          (* x (funcall (funcall thunk) (- x 1)))))))
               6)
    Returning a procedure
    Returning a procedure
    Returning a procedure
    Returning a procedure
    Returning a procedure
    Returning a procedure
    Returning a procedure
    720

There is one caveat. You must be able to return the
procedure without attempting to make the recursive call.

Let's transform the returned
function before returning it by applying an arbitrary function h to
it:

    (Y (lambda (thunk)
         (h (lambda (x)
              (if (zerop x)
                  1
                  … )))))

OK, so now when we `(funcall thunk)` we don't get what we want;
we've got an invocation of h around it. If we have an inverse to h,
h^{-1}, available, we can undo it:

    (y (lambda (thunk)
         (h (lambda (x)
              (if (zerop x)
                  1
                  (* x (funcall (h^{-1} (funcall thunk)) (- x 1))))))))

As a concrete example, we return a list and at the call site we
extract the first element of that list before calling it:

    * (funcall (car (y (lambda (thunk)
                         (list (lambda (x)
                                 (if (zerop x)
                                     1
                                     (* x (funcall (car (funcall thunk)) (- x 1)))))))))
               6)
    720

So we can return a list of mutually recursive
functions:

    (y (lambda (thunk)
         (list
          ;; even?
          (lambda (n)
            (or (zerop n)
                (funcall (cadr (funcall thunk)) (- n 1))))
          ;; odd?
          (lambda (n)
            (and (not (zerop n))
                 (funcall (car (funcall thunk)) (- n 1)))))))
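Here is one way the returned list might be used, as a sketch. The thunkified operator is repeated under the hypothetical name `y-thunk` so the example stands alone, and `*even-odd*` and `*parity-checks*` are likewise names invented for the illustration.

```lisp
;; The thunkified Y operator from the text, under a hypothetical name.
(defun y-thunk (g)
  (let ((d (lambda (f)
             (funcall g (lambda () (funcall f f))))))
    (funcall d d)))

;; A two-element list: (even? odd?). Each function forces the thunk to
;; get the whole list back, then picks out its partner.
(defparameter *even-odd*
  (y-thunk (lambda (thunk)
             (list
              ;; even?
              (lambda (n)
                (or (zerop n)
                    (funcall (cadr (funcall thunk)) (- n 1))))
              ;; odd?
              (lambda (n)
                (and (not (zerop n))
                     (funcall (car (funcall thunk)) (- n 1))))))))

;; Pull the two functions out of the list and try them.
(defparameter *parity-checks*
  (destructuring-bind (even? odd?) *even-odd*
    (list (funcall even? 10) (funcall odd? 10) (funcall odd? 7))))
;; *PARITY-CHECKS* => (T NIL T)
```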

If we use the η-expanded version of the Y operator,
then we can adapt it to expect a list of mutually recursive functions
on the recursive call:

    (defun y* (&rest g-list)
      (let ((d (lambda (f)
                 (map 'list (lambda (g)
                              (lambda (&rest args)
                                (apply (apply g (funcall f f)) args)))
                      g-list))))
        (funcall d d)))

which we could use like this:

    * (let ((eo (y* (lambda (even? odd?)
                      (declare (ignore even?))
                      (lambda (n)
                        (or (zerop n)
                            (funcall odd? (- n 1)))))
                    (lambda (even? odd?)
                      (declare (ignore odd?))
                      (lambda (n)
                        (and (not (zerop n))
                             (funcall even? (- n 1))))))))
        (let ((even? (car eo))
              (odd? (cadr eo)))
          (do ((i 0 (+ i 1)))
              ((>= i 5))
            (format t "~%~d, ~s ~s"
                    i
                    (funcall even? i)
                    (funcall odd? i)))))
    0, T NIL
    1, NIL T
    2, T NIL
    3, NIL T
    4, T NIL

Instead of returning a list of mutually recursive functions, we could return them as multiple values.
We just have to be expecting multiple values at the call site:

    (defun y* (&rest gs)
      (let ((d (lambda (f)
                 (apply #'values
                        (map 'list
                             (lambda (g)
                               (lambda (&rest args)
                                 (apply (multiple-value-call g (funcall f f)) args)))
                             gs)))))
        (funcall d d)))
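A call site for the multiple-values version could look like the following sketch, which receives the functions with `multiple-value-bind`. The operator is repeated here under the hypothetical name `y*-values` so that it does not clobber the list-returning `y*`, and `*parity-of-4*` is also a name made up for the example.

```lisp
;; The multiple-values variant from the text, under a hypothetical name.
(defun y*-values (&rest gs)
  (let ((d (lambda (f)
             (apply #'values
                    (map 'list
                         (lambda (g)
                           (lambda (&rest args)
                             (apply (multiple-value-call g (funcall f f)) args)))
                         gs)))))
    (funcall d d)))

;; MULTIPLE-VALUE-BIND receives the two functions directly,
;; with no intermediate list to take apart.
(defparameter *parity-of-4*
  (multiple-value-bind (even? odd?)
      (y*-values (lambda (even? odd?)
                   (declare (ignore even?))
                   (lambda (n)
                     (or (zerop n)
                         (funcall odd? (- n 1)))))
                 (lambda (even? odd?)
                   (declare (ignore odd?))
                   (lambda (n)
                     (and (not (zerop n))
                          (funcall even? (- n 1))))))
    (list (funcall even? 4) (funcall odd? 4))))
;; *PARITY-OF-4* => (T NIL)
```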

MIT Scheme used to have a construct called a named lambda. A named
lambda has an extra first argument that is automatically filled in
with the function itself. So during evaluation of the body of a
named lambda, the name is bound to the named lambda, enabling the
function to call itself recursively:

    (defmacro named-lambda ((name &rest args) &body body)
      `(y1 (lambda (,name)
             (lambda ,args
               ,@body))))

    * (funcall (named-lambda (factorial x)
                 (if (zerop x)
                     1
                     (* x (funcall factorial (- x 1)))))
               6)
    720

This leads us to named let expressions. In a named let,
the implicit lambda that performs the let
bindings is a named lambda. Using that name to invoke the lambda on a different set of arguments is
like recursively re-doing the let.

    * (named-let fact ((x 6)) (if (zerop x) 1 (* x (funcall fact (- x 1)))))
    720
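The text does not show a definition of `named-let`, so the macro below is only a guess at one plausible definition, built on `y1` and `named-lambda` as given earlier. It assumes every binding has the form `(variable initial-value)`; `*fact-6*` is a hypothetical name used to hold the result.

```lisp
;; Supporting definitions repeated from the text so the sketch stands alone.
(defun y1 (g)
  (let ((d (lambda (f)
             (funcall g (lambda (&rest args)
                          (apply (funcall f f) args))))))
    (funcall d d)))

(defmacro named-lambda ((name &rest args) &body body)
  `(y1 (lambda (,name)
         (lambda ,args
           ,@body))))

;; Assumed definition: the binding variables become the parameters of a
;; named lambda, which is immediately funcalled on the initial values.
(defmacro named-let (name bindings &body body)
  `(funcall (named-lambda (,name ,@(mapcar #'first bindings))
              ,@body)
            ,@(mapcar #'second bindings)))

(defparameter *fact-6*
  (named-let fact ((x 6))
    (if (zerop x)
        1
        (* x (funcall fact (- x 1))))))
;; *FACT-6* => 720
```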

In Scheme, you use `letrec` to define recursive or mutually recursive
procedures. Internal definitions expand into an appropriate `letrec`.
`letrec` achieves the necessary circularity not through the Y operator,
but through side effects. It is hard to tell the difference, but there
is a difference. Using the Y operator would allow you to have recursion,
but avoid the implicit side effects in a `letrec`.
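For contrast, here is a rough Common Lisp rendering of the side-effect approach. It is a sketch of the general idea (bind a placeholder, then assign the closure into it), not the exact expansion any particular Scheme uses. `*fact-by-assignment*` is a hypothetical name.

```lisp
;; Circularity through assignment: the closure captures the binding of
;; FACT, and the SETQ then makes that binding point at the closure itself.
(defparameter *fact-by-assignment*
  (let ((fact nil))                 ; 1. bind the name to a placeholder
    (setq fact                      ; 2. overwrite the placeholder
          (lambda (x)
            (if (zerop x)
                1
                (* x (funcall fact (- x 1))))))
    fact))

(funcall *fact-by-assignment* 6)   ; => 720
```

The Y operator versions earlier in the text reach the same result without the `setq`.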

Oleg Kiselyov has more to say about the Y operator at http://okmij.org/ftp/Computation/fixed-point-combinators.html