Saturday, May 15, 2021

β-conversion

If you have an expression that is an application, and the operator of the application is a lambda expression, then you can β-reduce the application by substituting the arguments of the application for the bound variables of the lambda within the body of the lambda.

(defun beta (expression if-reduced if-not-reduced)
  (if (application? expression)
      (let ((operator (application-operator expression))
            (operands (application-operands expression)))
        (if (lambda? operator)
            (let ((bound-variables (lambda-bound-variables operator))
                  (body (lambda-body operator)))
              (if (same-length? bound-variables operands)
                  (funcall if-reduced
                           (xsubst body
                                   (table/extend* (table/empty)
                                                  bound-variables
                                                  operands)))
                  (funcall if-not-reduced)))
            (funcall if-not-reduced)))
      (funcall if-not-reduced)))

* (beta '((lambda (x y) (lambda (z) (* x y z))) a (+ z 3))
        #'identity (constantly nil))
(LAMBDA (#:Z460) (* A (+ Z 3) #:Z460))

A large, complex expression may or may not have subexpressions that can be β-reduced. If neither an expression or any of its subexpressions can be β-reduced, then we say the expression is in “β-normal form”. We may be able to reduce an expression to β-normal form by β-reducing where possible. A β-reduction can introduce further reducible expressions if we substitute a lambda expression for a symbol in operator position, so reducing to β-normal form is an iterative process where we continue to reduce any reducible expressions that arise from substitution.

(defun beta-normalize-step (expression)
  (expression-dispatch expression

    ;; case application
    (lambda (subexpressions)
      ;; Normal order reduction
      ;; First, try to beta reduce the outermost application,
      ;; otherwise, recursively descend the subexpressions, working
      ;; from left to right.
      (beta expression
            #'identity
            (lambda ()
              (labels ((l (subexpressions)
                         (if (null subexpressions)
                             '()
                             (let ((new-sub (beta-normalize-step (car subexpressions))))
                               (if (eq new-sub (car subexpressions))
                                   (let ((new-tail (l (cdr subexpressions))))
                                     (if (eq new-tail (cdr subexpressions))
                                         subexpressions
                                         (cons (car subexpressions) new-tail)))
                                   (cons new-sub (cdr subexpressions)))))))
                (let ((new-subexpressions (l subexpressions)))
                  (if (eq new-subexpressions subexpressions)
                      expression
                      (make-application new-subexpressions)))))))

    ;; case lambda
    (lambda (bound-variables body)
      (let ((new-body (beta-normalize-step body)))
        (if (eql new-body body)
            expression
            (make-lambda bound-variables new-body))))

    ;; case symbol
    (constantly expression)))

;;; A normalized expression is a fixed point of the
;;; beta-normalize-step function.
(defun beta-normalize (expression)
  (do ((expression expression (beta-normalize-step expression))
       (expression1 '() expression)
       (count 0 (+ count 1)))
      ((eq expression expression1)
       (format t "~%~d beta reductions" (- count 1))
       expression)))

You can compute just by using β-reduction. Here we construct an expression that reduces to the factorial of 3. We only have β-reduction, so we have to encode numbers with Church encoding.

(defun test-form ()
  (let ((table
         (table/extend*
          (table/empty)
          '(one
            three
            *
            pred
            zero?
            y)
          '(
            ;; Church numeral one
            (lambda (f) (lambda (x) (f x)))

            ;; Church numeral three 
            (lambda (f) (lambda (x) (f (f (f x)))))

            ;; * (multiply Church numerals)
            (lambda (m n)
              (lambda (f)
                (m (n f))))

            ;; pred (subtract 1 from Church numeral)
            (lambda (n)
              (lambda (f)
                (lambda (x) (((n (lambda (g)
                                   (lambda (h)
                                     (h (g f)))))
                              (lambda (u) x))
                             (lambda (u) u)))))

            ;; zero? (test if Church numeral is zero)
            (lambda (n t f) ((n (lambda (x) f)) t))

            ;; Y operator for recursion
            (lambda (f)
              ((lambda (x) (f (x x)))
               (lambda (x) (f (x x)))))

            )))
        (expr
         '((lambda (factorial)
             (factorial three))
           (y (lambda (fact)
                (lambda (x)
                  (zero? x
                   one
                   (* (fact (pred x)) x))))))))
    (xsubst expr table)))

* (test-form)

((LAMBDA (FACTORIAL) (FACTORIAL (LAMBDA (F) (LAMBDA (X) (F (F (F X)))))))
 ((LAMBDA (F) ((LAMBDA (X) (F (X X))) (LAMBDA (X) (F (X X)))))
  (LAMBDA (FACT)
    (LAMBDA (X)
      ((LAMBDA (N T F) ((N (LAMBDA (X) F)) T)) X
       (LAMBDA (F) (LAMBDA (X) (F X)))
       ((LAMBDA (M N) (LAMBDA (F) (M (N F))))
        (FACT
         ((LAMBDA (N)
            (LAMBDA (F)
              (LAMBDA (X)
                (((N (LAMBDA (G) (LAMBDA (H) (H (G F))))) (LAMBDA (U) X))
                 (LAMBDA (U) U)))))
          X))
        X))))))

* (beta-normalize (test-form))

127 beta reductions
(LAMBDA (F) (LAMBDA (X) (F (F (F (F (F (F X))))))))

This is the Church numeral for 6.

I find it pretty amazing that we can bootstrap ourselves up to arithmetic just by repeatedly β-reducing where we can. It doesn't seem like we're actually doing any work. We're just replacing names with what they stand for.

The β-substitution above replaces all the bound variables with their arguments if there is the correct number of arguments. One could easily implement a partial β-substitution that replaced only some of the bound variables. You'd still have an application, but some of the bound variables in the lambda would be eliminated and the corresponding argument would be removed.

Monday, May 10, 2021

Substitution

In McCarthy's early papers on Lisp, he notes that he needs a modified version of subst which needs to be aware of quoted expressions (and avoid substituting within them). He would also need a subst that was aware of lambda expressions. It would have to avoid substituting within the lambda if the name substituted matches one of the bound variables. To be useful for evaluation, it will have to deal with accidental variable capture when substituting within a lambda.

The root problem is that expressions are actually structured objects, but we are working with the list representation of those objects. Instead of substituting by operating on objects, we substitute on the list representation. We have to arrange for the syntactic substitution on the list representation to preserve the semantics of substitution on the objects they represent.

In the substitution model, we take a symbolic expression and replace some of the atoms in the expression with other expressions. We first need a way to discriminate between the different kinds of expressions. An expression is either an atomic symbol, or a list of expressions called an application. There are no other kinds of expressions.

(defun expression-dispatch (expression if-symbol if-application)
  (cond ((symbolp expression) (funcall if-symbol expression))
        ((consp expression)   (funcall if-application expression))
        (t (error "~s is not an expression." expression))))
Substitution is straightforward:
(defun xsubst (table expression)
  (expression-dispatch expression
    (lambda (symbol)
      (funcall table symbol #'identity (constantly symbol)))

    (lambda (subexpressions)
      (map 'list (lambda (subexpression) (xsubst table subexpression)) subexpressions))))

* (let ((table (table/extend (table/empty) 'x '(* a 42))))
    (xsubst table '(+ x y)))  
(+ (* A 42) Y)
We need a table of multiple substitutions so that we can substitute in parallel:
* (let ((table (table/extend
                 (table/extend (table/empty) 'x 'y)
                 'y 'x)))
    (xsubst table '(+ x y)))
(+ Y X)

So far, so good. Let's add lambda expressions. First, we need to add a new expression kind:

(defun expression-dispatch (expression if-symbol if-lambda if-application) 
  (cond ((symbolp expression) (funcall if-symbol expression))
        ((consp expression)
         (cond ((eq (car expression) 'lambda)
                (funcall if-lambda (cadr expression) (caddr expression)))
               (t (funcall if-application expression))))
        (t (error "~s is not an expression." expression))))

Substitution within a lambda expression is a bit tricky. First, you don't want to substitute a symbol if it is one of the bound variables of the lambda expression. Second, substituting a symbol may introduce more symbols. We don't want the new symbols to be accidentally captured by the bound variables in the lambda. If any new symbol has the same name as a bound variable, we have to rename the bound variable (and all its occurrances) to a fresh name so that it doesn't capture the new symbol being introduced. We'll need a helper function

(defun free-variables (expression)
   (expression-dispatch expression
     (lambda (symbol) (list symbol))
     (lambda (bound-variables body)
       (set-difference (free-variables body) bound-variables))
     (lambda (subexpressions)
       (fold-left #'union '() (map 'list #'free-variables subexpressions)))))
Now when we substitute within a lambda, we first find each free variable in the lambda, look it up in the substitution table, and collect the free variables of the substituted value:
(map 'list (lambda (var)
             (funcall table var #'free-variables (constantly '())))
      (free-variables expression))
This gives us the new free variables for each substitution. The union of all of these is the set of all the new free variables
(fold-left #'union '()
           (map 'list (lambda (var)
                        (funcall table var #'free-variables (constantly '())))
                 (free-variables expression)))
We have to rename the bound variables that are in this set:
 
(intersection 
  bound-variables
  (fold-left #'union '()
             (map 'list (lambda (var)
                          (funcall table var #'free-variables (constantly '())))
                  (free-variables expression))))
So we make a little table for renaming:
(defun make-alpha-table (variables)
  (fold-left (lambda (table variable)
               (table/extend table variable (gensym (symbol-name variable))))
             (table/empty)
             variables))

(let ((alpha-table
       (make-alpha-table
        (intersection 
          bound-variables
          (fold-left #'union '()
                     (map 'list (lambda (var)
                                  (funcall table var #'free-variables (constantly '())))
                          (free-variables expression)))))))
  …)
We rename the bound variables as necessary:
  (make-lambda
    (map 'list (lambda (symbol)
                 (funcall alpha-table symbol #'identity (constantly symbol)))
         bound-variables)
    …)
Finally, we redact the bound variables from the substitution table and append the alpha-table to make the substitutions we need for the lambda body
  (make-lambda
   (map 'list (lambda (symbol)
                (funcall alpha-table symbol #'identity (constantly symbol)))
        bound-variables)
   (xsubst (table/append alpha-table (table/redact* table bound-variables))
           body))))
The entire definition of xsubst is now this:
(defun xsubst (table expression)
  (expression-dispatch expression
    (lambda (symbol)
      (funcall table symbol #'identity (constantly symbol)))

    (lambda (bound-variables body)
      (let ((alpha-table
             (make-alpha-table
              (intersection
               bound-variables
               (fold-left #'union '()
                          (map 'list (lambda (var)
                                       (funcall table var
                                                #'free-variables
                                                (constantly '())))
                               (set-difference (free-variables body) bound-variables)))))))
        (make-lambda
         (map 'list (lambda (symbol)
                      (funcall alpha-table symbol #'identity (constantly symbol)))
              bound-variables)
         (xsubst (table/append alpha-table (table/redact* table bound-variables))
                 body))))

    (lambda (subexpressions)
      (make-application
       (map 'list (lambda (subexpression)
                    (xsubst table subexpression))
            subexpressions)))))
This is certainly more complicated than simple substitution, but we can see it does the right thing here:
* (xsubst (table/extend (table/empty) 'x '(* a y)) '(lambda (y) (+ x y)))
(LAMBDA (#:Y234) (+ (* A Y) #:Y234))

It should be obvious how to add quoted forms. This would require adding a new kind of expression to expression-dispatch and a new handling clause in xsubst that avoids substitution.

I'm not completely happy with how we've added lambda expressions to the expression syntax. Using the symbol lambda as a syntactic marker for lambda expressions causes problems if we also want to use that symbol as an argument or variable. Initially, it seems reasonable to be able to name an argument “lambda”. Within the body of the function, references to the variable lambda would refer to that argument. But what about references in the operator position? By defining lambda expressions as three element lists beginning with the symbol lambda we've made it ambiguous with two-argument applications whose operator is the variable lambda. We have to resolve this ambiguity. The current behavior is that we always interpret the symbol lambda as a syntactic marker so you simply cannot use a variable named lambda as a function.

Monday, May 3, 2021

Lightweight table

You don't need a data structure to make a lookup table. You can make a table just out of the lookup function. In this example, we start with a continuation passing style lookup function:

lookup (key if-found if-not-found)
  
    Invokes (funcall if-found value) if key is in the table,
    invokes (funcall if-not-found) otherwise.
An empty table just invokes the if-not-found continuation:
(defun table/empty ()
  (lambda (key if-found if-not-found)
    (declare (ignore key if-found))
    (funcall if-not-found)))
A table can be extended by wrapping it:
(defun table/extend (table key* value)
  (lambda (key if-found if-not-found)
    (if (eql key key*)
        (funcall if-found value)
        (funcall table key if-found if-not-found))))
So let's try it out:
(defvar *table-1* (table/extend 
                    (table/extend
                      (table/empty)
                      'foo 42)
                    'bar 69))

* (funcall *table-1* 'foo #'identity (constantly 'not-found))
42

* (funcall *table-1* 'quux #'identity (constantly 'not-found))
NOT-FOUND
You can also redact an entry from a table by wrapping the table:
(defun table/redact (table redacted)
  (lambda (key if-found if-not-found)
    (if (eql key redacted)
        (funcall if-not-found)
        (funcall table key if-found if-not-found))))

(defvar *table-2* (table/redact *table-1* 'foo))

* (funcall *table-2* 'foo #'identity (constantly 'not-found))
NOT-FOUND

Are there any advantages to implementing a table in this curious manner? Building a table by nesting a series of lookup steps leads to a linear lookup in linear space, so this kind of table should be more or less comparable to an alist for individual entries. Unlike a traditional table made with a data structure, you cannot enumerate the keys and values in the table. On the other hand, you gain the ability to map keys to values without having to enumerate the keys:

(defun table/bind-predicate (table predicate value)
  (lambda (key if-found if-not-found)
    (if (funcall predicate key)
        (funcall if-found value)
        (funcall table key if-found if-not-found))))

;;; bind all even numbers to the symbol 'EVEN
(defvar *table-3* 
  (table/bind-predicate *table-2* (lambda (n) (and (numberp n) (evenp n))) 'even))

* (funcall *table-3* 6 #'identity (constantly 'not-found))
EVEN
Or you can add a default value to an existing table:
(defun table/add-default (table default-value)
  (lambda (key if-found if-not-found)
    (declare (ignore if-not-found))
    (funcall table key
      if-found
      (lambda () (funcall if-found default-value)))))

(defvar *table-4* (table/add-default *table-3* 'default))

* (funcall *table-4* 'bar #'identity (constantly 'not-found))
69
    
* (funcall *table-4* 'xyzzy #'identity (constantly 'not-found))
DEFAULT

Perhaps the biggest disadvantage of this implementation is the difficulty in inspecting a table.

* *table-4*
#<CLOSURE (LAMBDA (KEY IF-FOUND IF-NOT-FOUND) :IN TABLE/ADD-DEFAULT) {1003CD641B}>
We can use the object inspector to peek inside the closure and maybe sleuth out what this table is made out of, but it isn't just an alist where we can print out the entries.

So far, we've defined a table as being a procedure with the (key if-found if-not-found) signature, but we can flip this around and say that any procedure with a (key if-found if-not-found) signature can be thought of as a table. For example, a regular expression matcher could be considered to be a table of strings (if that were a more useful model).