Wednesday, February 19, 2020

Stupid pattern matching tricks

There are a few pattern matching constructs in Common Lisp. For instance, destructuring-bind matches list structure against a tree of variable names and binds the variables accordingly. Macros can destructure their argument list. Even functions have simple keyword matching. These constructs don't give access to their pattern matchers as first-class objects, but perhaps you want that. You can construct a simple pattern matcher by wrapping one of these constructs in the appropriate macro.

We'll want the result of our pattern match to be an alist mapping symbols to the objects they matched with. First, we'll need a function that takes a pattern and returns a list of the variables in the pattern. flatten will work nicely for destructuring-bind:
(defun flatten (pattern)
  (cond ((null pattern) '())
 ((symbolp pattern) (list pattern))
 ((consp pattern) (append (flatten (car pattern))
     (flatten (cdr pattern))))
 (t (error "Not a pattern"))))

CL-USER> (flatten '((a b . c) d e . f))
(A B C D E F)
Then we want to generate code that will make an alist:
CL-USER> `(list ,@(map 'list (lambda (var)
           `(cons ',var ,var))
               (flatten '((a b . c) d e . f))))
(LIST (CONS 'A A) (CONS 'B B) (CONS 'C C) (CONS 'D D) (CONS 'E E) (CONS 'F F))
Finally, we wrap a call to destructuring-bind with a macro:
CL-USER> (defmacro destructuring-pattern-matcher (pattern)
           `(lambda (form)
              (destructuring-bind ,pattern form
                (list ,@(map 'list (lambda (var)
                              `(cons ',var ,var))
                     (flatten pattern))))))
DESTRUCTURING-PATTERN-MATCHER

CL-USER> (destructuring-pattern-matcher ((a b . c) d e . f))
#<FUNCTION (LAMBDA (FORM)) {10027B143B}>
destructuring-pattern-matcher returns a pattern matcher as a first-class procedure we can call on a pattern to get an alist of bindings:
CL-USER> (defvar *matcher* (destructuring-pattern-matcher ((a b . c) d e . f)))
*MATCHER*

CL-USER> (funcall *matcher* '((1 2 3 4) 5 6 7 8))
((A . 1) (B . 2) (C 3 4) (D . 5) (E . 6) (F 7 8))

We can use this trick to get at the destructuring pattern match done by defmacro. First, we need a function that takes a macro lambda list and returns a list of the variables it binds. I won't reproduce the function here, it is too large, but here's a sample call:
CL-USER> (macro-lambda-list-variables 
            '((foo bar &optional (baz 'default baz-supplied-p) . more) quux
              &rest rest
              &key ((:key key-variable) 'key-default key-supplied-p) key2
              &aux (auxvar 'auxvalue)))
(FOO BAR BAZ BAZ-SUPPLIED-P MORE QUUX REST KEY-VARIABLE KEY-SUPPLED-P KEY2 AUXVAR)
If we were matching the list '(1 e) against the pattern (a b &optional c), we'd want to generate code something like this:
(MACROLET ((MACRO (A B &OPTIONAL C)
             (LIST 'LIST
     (LIST 'CONS ''A (LIST 'QUOTE A))
     (LIST 'CONS ''B (LIST 'QUOTE B))
                   (LIST 'CONS ''C (LIST 'QUOTE C)))))
  (MACRO 1 E))
We'll do this in stages:
(defun make-macro-pattern-matcher-body (pattern)
  `(list 
    'list
    ,@(map 'list (lambda (var)
     `(list 'cons '',var `',,var))
    (macro-lambda-list-variables pattern))))

(defun make-macro-pattern-matcher (pattern)
  (let ((body (make-macro-pattern-matcher-body pattern)))
    (lambda (form)
      `(macrolet ((macro ,pattern
      ,body))
  (macro ,@form)))))

(defmacro macro-pattern-matcher (pattern)
  (let ((matcher  (make-macro-pattern-matcher pattern)))
    `(lambda (form)
       (eval (funcall ',matcher form)))))
Now we can make a pattern matcher that works like the macro destructuring facility:
CL-USER> (setq *matcher* 
            (macro-pattern-matcher 
       ((foo bar &optional (baz 'default baz-supplied-p) . more) quux
               &rest rest
               &key ((:key key-variable) 'key-default key-supplied-p) key2
               &aux (auxvar 'auxvalue))))
#<FUNCTION (LAMBDA (FORM)) {10027B1D3B}>

CL-USER> (funcall *matcher* '((1 2 3 4) 5 :key 6 :key2 7))
((FOO . 1)
 (BAR . 2)
 (BAZ . 3)
 (BAZ-SUPPLIED-P . T)
 (MORE 4)
 (QUUX . 5)
 (REST :KEY 6 :KEY2 7)
 (KEY-VARIABLE . 6)
 (KEY-SUPPLIED-P . T)
 (KEY2 . 7)
 (AUXVAR . AUXVALUE))
You can do a similar trick with regular lambda lists, but while they have keywords, they don't destructure.

You have to be careful when writing the expansion for the binding alist. Too much quoting and you end up with the names rather than their values in the output:
((foo . foo)
 (bar . bar)
 …etc…)
not enough, you end up with the values of the values in the output:
CL-USER> (defvar e 22)
E

CL-USER> (funcall *matcher* '((1 2 e) 5))
((FOO . 1)
 (BAR . 2)
 (BAZ . 22) ; Wrong! Should be 'Eetc…)

Wednesday, February 12, 2020

A polygot program puzzle

Can you come up with an expression that evaluates to the symbol 'scheme in a Scheme system, but evaluates to the symbol 'common-lisp in a Common Lisp system?

Anaphoric if

An anaphoric if expression binds the identifier “it” to the value of the conditional in the scope of the consequent
(aif (find-broken-computer)
     (fix it))
I have two objections to anaphoric macros. The first is that the binding of “it” isn't obvious, the second is the inflexibility of the variable name. You are kind of stuck with “it”. What if you wanted to use “it” to mean something else? Maybe you have an IT department to fix your computers
(let ((it (find-department "IT")))
  (aif (find-broken-computer)
       (tell it (fix it))))   ; name conflict
or maybe you want to nest two anaphoric conditionals
(aif (find-broken-computer)
     (aif (find-broken-component it)
          (repair it)
          (replace it)))
In this case, I want to replace the computer if I cannot repair the broken component, but the inner binding of “it” shadows the outer binding and makes it inaccessible.

The solution is pretty obvious if you think about it (though sometimes it takes me a while to see the obvious). Just replace the conditional test form with a binding form
(aif (it (find-broken-computer))
     (fix it))
This makes it obvious we are binding the variable “it” to the value returned by (find-broken-computer), and it lets us choose the name we bind. If we want to nest these, it would look like this
(aif (computer (find-broken-computer))
     (aif (component (find-broken-component computer))
          (repair component)
          (replace computer)))
But I'm not sure if this is so much more concise and clear than simply using a let expression that it is worth adding this syntax to the language. It's one more thing the reader of the code has to be prepared to encounter.

A slightly different approach would move the binding form closer to where it is used. Note that there is no point in binding a name in the alternative clause to the conditional because it will always have the value nil.
(aif (find-broken-computer)
     (it (fix it)))
and instead of using a let binding, I could use a lambda binding
(aif (find-broken-computer)
     (λ (it) (fix it)))
aif no longer needs to be a macro but can be an ordinary function, which might be handy if your language doesn't have macros
(defun aif (it consequent &optional alternative)
  (if it
      (funcall consequent it)
      (if alternative
          (funcall alternative)
          nil)))

(aif (find-broken-computer)
     (λ (computer)
        (aif (find-broken-component computer)
             (λ (component) (fix component))
             (λ () (replace computer)))))
The explicit lambdas make it obvious what is being bound and what the scope of the binding is, but they do add a bit of visual noise.

Instead of using anaphoric if, I just write the slightly more verbose
(let ((computer (find-broken-computer)))
  (if computer
      (let ((component (find-broken-component)))
        (if component
            (repair component)
            (replace computer)))))
The binding is obvious, and I get to choose the variable being bound; both problems solved. I don't see a compelling reason to use the anaphoric version.

Addendum

Hexstream suggests that “No discussion of anaphoric macros can be complete without at least mentioning anaphoric-variants: https://www.hexstreamsoft.com/libraries/anaphoric-variants/” I wouldn't want to be incomplete, so consider it mentioned.

Tuesday, February 11, 2020

Macro pitfalls

Macros are a unique source of power in Common Lisp, but there are some pitfalls to watch out for.

A compiler macro is special macro that is expanded only by the compiler. The interpreter doesn't expand the macro and simply evaluates the form like a normal function call. If you aren't careful when writing a compiler macro, the interpreted and compiled forms may not evaluate the same and that's probably not what you want. Here we abuse this effect
(defun foo (x) 'interpreted)

(define-compiler-macro foo (x) ''compiled)

CL-USER> (foo)
INTERPRETED

CL-USER> ((lambda () (foo)))
COMPILED
That might be unexpected. It appears that in this implementation (SBCL) the compiler is called on lambda expressions when they are evaluated.

Like all macros, a compiler macro is given the unevaluated source code of the arguments, not the value. We can see that in this example
(defun foo (l r)
  (format t "~%Foo")
  (list r l))

(define-compile-macro foo (l r) 
  `(progn 
     (format t "~%Foo")
     (list ,r ,l)))

CL-USER> (foo (progn (format t "~%First") 'l) (progn (format t "~%Second") 'r))

First
Second
Foo
(r l)

CL-USER> ((lambda () (foo (progn (format t "~%First") 'l) (progn (format t "~%Second") 'r))))

Foo
Second
First
(r l)
When interpreted, the arguments are evaluated left to right before the function is entered. When compiled, the arguments end up being evaluated right to left and after the function is entered.

Unless you really want this — and shame on you if you do — you have to be careful when writing your macro to preserve the left-to-right, call-by-value semantics that are probably expected. The easiest way to do this is to write the expansion so that it just substitutes the body of the function. Something like this
(define-compiler-macro foo (l r)
  `((lambda (l r)
      (format t "~%Foo")
      (list r l))
    ,l
    ,r))

CL-USER> (foo (progn (format t "~%First") 'l) (progn (format t "~%Second") 'r))

First
Second
Foo
(r l)
Or you could use a let expression with the same effect
(define-compiler-macro foo (l r)
  `(let ((l ,l)
         (r ,r))
     (format t "~%Foo")
     (list r l)))
The version with the lambda expression doesn't even require putting a block of let bindings at the front. You just plop down the original argument list and body after the lambda, but both forms are equivalent.

The problem with doing this is that you have probably disabled the ability of the compiler to optimize the expression. You are forcing the compiler to ensure that the arguments are evaluated in left-to-right order before the body. A Sufficiently Smart compiler might be able to provide some optimizations anyway. If your compiler is not Sufficiently Smart, you can take matters in to your own hands and substitute the arguments at the point they are used. Just be aware that you might be surprising people by changing the semantics at the call site.

Funny semantics isn't just a problem with compiler macros. Regular macros have to be written with care as well or you may surprise users when they write code they think are normal function calls. Compiler macros just have the unique property that they can change the semantics between interpreted and compiled code.

You can see a related effect when using symbol macros. A symbol macro substitutes a piece of code that computes a value. If we write
CL-USER> (let ((l (progn (format t "~%First") 'l))
               (r (progn (format t "~%Second") 'r)))
           (format t "~%Let body")
           (list r l))

First
Second
Let body
(r l)
we get the standard left-to-right, call-by-value evaluation. But we can mimic normal-order reduction by substituting the code for l and r before evaluating the body of the let by use of symbol-macrolet*
CL-USER> (symbol-macrolet ((l (progn (format t "~%First") 'l))
                           (r (progn (format t "~%Second") 'r)))
           (format t "~%Symbol-macrolet body")
           (list r l))

Symbol-macrolet body
Second
First
(r l)
If one of the arguments to a macro is a block of code, for instance the &body argument, then you probably want to avoid accidental variable capture.
(defmacro capturing-macro (&body body)
  `(let ((temp 'captured))
     (format t "~%Macro body binds temp to ~S" temp)
     ,@body))

(let ((temp 'lexical))
  (capturing-macro
     (format t "~%Temp is ~s" temp)))

Macro body binds temp to CAPTURED
Temp is CAPTURED
NIL
The lexical binding of temp is shadowed by the binding introduced by capturing-macro. This is probably unintended (except in the case of anamorphic macros, where capture is intended). Instead, you can ensure lexical scoping is maintained by closing over the body before introducing any new bindings
(defmacro non-capturing-macro (&body body)
  `(let ((temp 'captured)
         (body (lambda () ,@body)))
     (format t "~%Macro body binds temp to ~S" temp)
     (funcall body)))

(let ((temp 'lexical))
  (non-capturing-macro
    (format t "~%Temp is ~s" temp)))

Macro body binds temp to CAPTURED
Temp is LEXICAL
NIL
In this case, even a fairly naive compiler ought to be able to inline the call to body because it is simply a lexically apparent code block.

Inadvertent capture can happen in other direction as well if the macro caller shadows a binding used by the macro.
(flet ((funcall (x) (format t "~%Unexpected")))
  (let ((temp 'lexical))
    (non-capturing-macro
      (list temp))))

Macro body binds temp to CAPTURED
Unexpected
NIL
Here the caller shadowed funcall and the code the macro introduced ends up inadvertently calling it. This doesn't happen often in practice because people rarely shadow the top-level functions a macro depends upon, and that is good because there isn't an easy way to solve this reverse capture problem (other than don't do that).

The “hygienic” macro system in Scheme solves both kinds of accidental capture by appropriately renaming variables. There is a price, however. You either have to forego direct code manipulation and use a special pattern matching language, or write code that explicitly keeps track of the environment where the variables are bound. For simple macros, the pattern matching language is adequate, but for more complex macros, neither option is appealing.

*Macrolet rhymes with Chevrolet, naturally.

Sunday, February 9, 2020

Four ways to use macros

The way I see it, there are about four five basic ways to use macros in Common Lisp.

First are macros that circumvent the regular call-by-value semantics. These might evaluate a subform at macro expansion time, treat a subform as a place (an l-value) rather than a value, or otherwise treat a subform as something other than a runtime function call. For example, if incf fully evaluated its argument, it could perform the increment on the value, but it couldn't put the value back where it got it. Another example is the check-type macro. You use it like this:
(defun foo (x)
  (check-type foo (integer 1 *) "a positive integer")
  (bar (- x 1)))
The check-type macro has to be a macro because it treats foo as a place (it will allow you to proceed by modifying foo), and it treats its second argument as a type specifier.

Second are macros that introduce new syntax to the language. Examples are cond, case, do, dotimes, defun, defvar, etc. These treat their arguments specially or have special clauses that don't act like ordinary function calls.
CL-USER> (macroexpand-all '(do ((i 0 (+ i 1))
                                (j 1 (* j 2)))
                               ((> j 65536) nil)
                             (format t "~%~2d ~5d" i j)))

(BLOCK NIL
  (COMMON-LISP:LET ((I 0) (J 1))
    (TAGBODY
      (GO #:G748)
     #:G747
      (TAGBODY (FORMAT T "~%~2d ~5d" I J))
      (COMMON-LISP:LET* ((#:NEW1 (+ I 1)) (#:NEW1 (* J 2)))
        (SETQ I #:NEW1)
        (SETQ J #:NEW1)
        NIL)
     #:G748
      (IF (> J 65536)
          NIL
          (GO #:G747))
      (RETURN-FROM NIL (PROGN NIL))))

Third are macros that implement tiny languages within Common Lisp. The loop macro is a good example. It looks like this
(loop for i from 1 to (compute-top-value)
      while (not (unacceptable i))
      collect (square i)
      do (format t "Working on ~D now" i)
      when (evenp i)
        do (format t "~D is a non-odd number" i) 
      finally (format t "About to exit!"))
It works like a little compiler. It parses the loop clauses and generates a Lisp form that carries them out
(BLOCK NIL
  (LET ((I 1) (#:LOOP-LIMIT-744 (COMPUTE-TOP-VALUE)))
    (DECLARE (TYPE (AND NUMBER REAL) #:LOOP-LIMIT-744)
             (TYPE (AND REAL NUMBER) I))
    (SB-LOOP::WITH-LOOP-LIST-COLLECTION-HEAD (#:LOOP-LIST-HEAD-745
                                              #:LOOP-LIST-TAIL-746)
      (TAGBODY
       SB-LOOP::NEXT-LOOP
        (WHEN (> I #:LOOP-LIMIT-744) (GO SB-LOOP::END-LOOP))
        (UNLESS (NOT (UNACCEPTABLE I)) (GO SB-LOOP::END-LOOP))
        (SB-LOOP::LOOP-COLLECT-RPLACD
         (#:LOOP-LIST-HEAD-745 #:LOOP-LIST-TAIL-746) (LIST (SQUARE I)))
        (FORMAT T "Working on ~D now" I)
        (IF (EVENP I)
            (FORMAT T "~D is a non-odd number" I))
        (SB-LOOP::LOOP-DESETQ I (1+ I))
        (GO SB-LOOP::NEXT-LOOP)
       SB-LOOP::END-LOOP
        (FORMAT T "About to exit!")
        (RETURN-FROM NIL
          (SB-LOOP::LOOP-COLLECT-ANSWER #:LOOP-LIST-HEAD-745)))))

Fourth are macros that run code in a special context. The with-… macros and when and unless fall in this category. These macros take ordinary Lisp code and wrap it with other code that establishes a context or tests a conditional.
CL-USER> (macroexpand '(when (condition) (foo) (bar)))

(IF (CONDITION)
    (PROGN (FOO) (BAR)))

CL-USER> (macroexpand '(with-open-file (foo "~/.bashrc" :if-does-not-exist :create)
                         (print (read-line foo))
                         (bar)))

(LET ((FOO (OPEN "~/.bashrc" :IF-DOES-NOT-EXIST :CREATE)) (#:G751 T))
  (UNWIND-PROTECT
      (MULTIPLE-VALUE-PROG1 (PROGN (PRINT (READ-LINE FOO)) (BAR)) (SETQ #:G751 NIL))
    (WHEN FOO (CLOSE FOO :ABORT #:G751))))

These aren't hard and fast categories, many macros can be thought of as in more than one category. All macros work by syntactic transformation and most treat at least one of their subforms as something other than a call-by-value form, for instance. There are also the occasional macros that have the look and feel of a standard function calls. The series package appears to allow you to manipulate series through standard function calls, but works by clever macroexpansion into iterative code.

I find it useful to think of macros in these four different ways, but I'm sure that others have their own categorizations that they find useful.

Addendum

An anonymous reader asked, “What about code walking/analysis/optimization?”. I really overlooked that. I think Richard Waters's series package would be a good example. It takes ordinary functional programs that can operate on series of data (coinductively defined data or codata), and turns it into the equivalent iterative construct that operates on one element at a time. It does this by clever macros that walk the code, analyse it, and rewrite it to a more optimal form
CL-USER> (sb-cltl2:macroexpand-all '(let ((x (scan-range :from 0 :below 10)))
                               (collect (choose-if #'evenp x))))

(COMMON-LISP:LET* ((X (COERCE (- 0 1) 'NUMBER))
                   (#:LASTCONS-751 (LIST NIL))
                   (#:LST-752 #:LASTCONS-751))
  (DECLARE (TYPE NUMBER X)
           (TYPE CONS #:LASTCONS-751)
           (TYPE LIST #:LST-752))
  (TAGBODY
   #:LL-756
    (SETQ X (+ X (COERCE 1 'NUMBER)))
    (IF (NOT (< X 10))
        (GO SERIES::END))
    (IF (NOT (EVENP X))
        (GO #:LL-756))
    (SETQ #:LASTCONS-751 (SB-KERNEL:%RPLACD #:LASTCONS-751 (CONS X NIL)))
    (GO #:LL-756)
   SERIES::END)
  (CDR #:LST-752)))
As you can see, the series code does a major rewrite of the original lisp code. An astute reader will notice that the let form has to have been redefined to do dataflow analysis of it's bindings and body. Thanks to anonymous for the suggestion.

Addendum 2: Symbol macros

A comment by Paul F. Dietz got me thinking about symbols and it occurred to me that symbol macros deserve their own category as well. Symbol macros appear to be an ordinary symbolic references to variables, but they expand to some code that computes a value. For instance, if foo were a symbol macro that expanded to (car bar), then using it in a form such as (+ foo 7) would expand to (+ (car bar) 7). A symbol macro is a two-edged sword. It is a very useful abstraction for providing a name to a computed value, but they also can fool the user of such a macro into thinking that a simple variable reference is happening when some more complex computation could be happening.

I think that makes seven ways and counting.

Thursday, February 6, 2020

Dispatching

There are times when you are faced with a complex piece of control flow
try {
    if (condition())
    {
        ... block 1 ...
    }
    else 
    {
        switch (someValue())
        {
          case CASE_A:
            ... block 2 ...
            break;

          case CASE_B:
            ... block 3 ...
            break;

          default:
            ... block 4 ...
            break;
        }
    }
} catch (SomeException someException) {
    ... block 5 ...
}
and you want to abstract the control flow — all the conditionals, switches, and try/catches — away from the code that does the work — the various blocks. In fact, here I've abstracted it away by putting "... block n ..." in place of the blocks of code.

If I were writing this in Scheme or Common Lisp, I'd consider using continuation passing style. I'd write a function dispatch-on-foo that would perform the dispatch, but then invoke one of several first-class procedures passed in as arguments
(defun dispatch-on-foo (foo bar case1 case2 case3 case4 case5)
   (if (... complex conditional ...) 
       (funcall case1)
       (handler-case (case some-value
                       ((case-a) (funcall case2))
                       ((case-b) (funcall case3))
                       (t (funcall case4)))
         (error (condition) (funcall case5)))))
At the call site, I'd write
(dispatch-on-foo <arg1> <arg2>
  (lambda ()
     ... block 1 ...)
  (lambda ()
     ... block 2 ...)
  (lambda ()
     ... block 3 ...)
  (lambda ()
     ... block 4 ...)
  (lambda ()
     ... block 5 ...))
This is a win when the complexity of the dispatch is enough that you don't want to replicate it at every call site. Notice how the nested blocks of code have been pulled up to the same level and linearized. Granted, you've cluttered up the call site with lambda expressions, but as Steele pointed out, you can think of these as anonymous go tags: dispatch-on-foo in essence will end up doing a jump to one of these tags and execute the block there, skipping and ignoring the other blocks. Once you get used to thinking in this way, the lambdas disappear just like the parens do for a seasoned Lisp hacker. They just look like jump targets or case labels, and the call site looks a lot like a case expression. It is a bit more powerful than an ordinary case expression because you could arrange for dispatch-on-foo to funcall the appropriate closure on an argument (and have the lambda expression take an argument of course).

You could do something analagous with Java 8's lambdas, but on the rare occasion I've wanted to do something similar in Java 7. The problem is that Java 7 doesn't have lambda expressions. The solution is to change these anonymous lambdas into named callback methods. First we define a generic interface with our callbacks:
interface DispatchOnFooCases<T> {
    T caseOne (void);
    T caseTwo (void);
    T caseThree (void);
    ... etc. ...
 }
then we define the dispatch method:
<T> T dispatchOnFoo (FooClass foo, BarClass bar, DispatchOnFooCases dispatchOnFooCases)
{
    try {
        if (conditional())
            return dispatchOnFooCases.caseOne();
        else
            switch (someValue()) {
              case CASE_A:
                return dispatchOnFooCases.caseTwo();

              case CASE_B:
                return dispatchOnFooCases.caseThree();

              default:
                return dispatchOnFooCases.caseFour();
            }
    } catch (SomeException someException) {
        return dispatchOnFooCases.CaseFive();
    }
}
finally, at the call site, we write this:
{
    int value =
        dispatchOnFoo<int> (foo, bar,
            new DispatchOnFooCases<int> ()
            {
                @Override
                int caseOne (void)
                {
                    ... block 1 ...
                    return aValue;
                }

                @Override
                int caseTwo (void)
                {
                    ... block 2 ...
                    return aDifferentValue;
                }

                ... etc. ...
            });
}
The good news is that we've accomplished our goal of abstracting the complex conditional dispatch from the code that does the real work — the method bodies at the call site.

There is, unfortunately, a fair amount of bad news. First, if you thought lambda expressions introduced clutter, then this is a serious amount of clutter. Between @Overrides, type declarations, interfaces, and methods, there is just a lot of extra stuff you have to type. It still might be worth the clutter if the dispatch conditions are complex enough. They just need to be that much more complex to justify all this machinery. (We've actually done the work the compiler would do to allocate and pass a “multi-closure”.) There are cases where this pays off, though.

The second piece of bad news is that Java is not (in general) tail recursive. This means that the call to dispatchOnFoo and the callback to one of the cases both introduce a new stack frame. So although the case methods run in the same lexical environment as where they are defined, they are running two stack frames deeper. This won't make much of a difference unless you try to loop by recursively calling the code. In that case, you need to be very careful to limit the amount of recursion or you will overflow the stack. It is best to avoid recursion as much as possible in the bodies of the cases.

You probably won't need to resort to this doing this. It can be a case of the cure being worse than the disease. The complexity of introducing callbacks can exceed the complexity of the conditional you are trying to abstract. But this is an interesting way to abstract a very complex conditional and can come in handy when you can justify using it. I have actually used this technique in production code to separate some complex control flow from the code that did the actual work.

Sunday, January 26, 2020

The pros and cons of Agile

Agile methodology is currently the popular way of attempting to develop commodity software in a factory-like manner.  It's a little hard to define what exactly is Agile and what is not.  I've worked in several companies and with several groups in companies that all claim to be Agile, yet they did things in very different ways.  But they all seemed to agree on a few things, and this I suppose could be the spirit of Agile, if not the letter.

The basic characteristic of Agile is that you break down the software into smaller and smaller tasks until you reach those tasks that can be comfortably handled by a small (2-8 person) team of engineers to complete in a small (1-2 week) time frame.  Then at the end of the time frame, you supposedly have some software that “works” in some sense of the word.  It exhibits basic functionality and the general gist of what the customer wanted, if not satisfying all the requirements for the entire project. Then over the next several time periods of development, the software is iteratively improved by fleshing out remaining requirements, adding needed functionality, hooking components together, etc. During this time, the customer is involved to make sure that what is being delivered is actually what is needed.

Some projects I worked on did this formally by a book and followed strict guidelines for the methodology, others just winged it, but all had the basic characteristics above.

One of the primary advantages of Agile is its use to management.  By having the large problem broken down into team-size, biweekly pieces, management can easily allocate and track resources usage and progress.  They can treat a team as a black box of software development and assign tasks as they arise.  Management can attempt to measure the performance of a team and see whether it is increasing, decreasing, or remaining steady.  Teams are what managers like to manage.

Another advantage is frequent feedback from the customer.  Since after each time period, a somewhat working version of some fragment of the product is available for demonstration, the customer can give feedback as to how and whether this seems to meet his needs.  He can offer suggestions about what might be improved, what features he needs to make the product at least minimally useful to him, and prevent development from getting off track.

But Agile is not a panacea.  There is still a significant amount of software produced with the traditional “waterfall” methodology with specification completed before coding begins and integration done as a final step in coding and only then releasing to the customer.  There is also a fair amount of software written “artistically”. I would define artistic software as that written by a single developer working alone over a period of several months. Frequently, such a project never gets beyond the hobbyist stage, and as such it is a risky approach to writing production code. But on occasion, an artistic project can turn into something novel and useful. It more often exhibits a unity of vision and coherence that is harder to find in software written by groups of people. (Which isn't to say that small groups cannot write software with unity of vision and coherence, it's just harder. Or they'll have one particular person in the group that has more insight than the others.)

Managers aren't as useful to artistic developers. Artistic developers tend to manage themselves. And you cannot swap out one developer for another without swapping out the entire project with him. A manager can work with an artistic developer as a peer, and help manage the project, but cannot manage the developer.

Frequently involving customers has its pros and cons as well. Often customers have difficulty imagining anything beyond incremental improvements to the current ways of doing things. They'll want a UI widget that will make some task slightly easier, but not think of automating the task altogether. They'll want to automate a series of inefficient tasks when a different viewpoint of the end result might make those intermediate tasks irrelevant. Customers are not impressed with changes to the code that don't produce visible effects. You may have spent a week refactoring in order to make it trivial to add new commands and new outputs, but customers don't care. Customers don't care about potential use cases, they care about their specific use case to the exclusion of everything else. This can be discouraging to developers.

Because Agile is so useful to managers, big and intermediate sized companies will continue to use it to develop commodity software in a factory-like style. It isn't going to be replaced any time soon. But there is still ample room in the market for small companies and individuals with vision to carve out niches that Agile methodologies will overlook and find tricky to fill.

(But I'm a romantic at heart, and I like the image of the lone hacker crafting software on his home computer in his room late at night. If only it were easy to make a living that way.)

Tuesday, January 21, 2020

But what if you really want to push a stack frame?

If you really don't want tail recursion, the solution is simple: don't put the call in “tail position”. We define a rather trivial function dont-tail-call and use it like this:
(dont-tail-call
  (foo x))
The semantics of Scheme are that the arguments are evaluated before the call to the function, so the call to foo is required to occur before the call to dont-tail-call which necessitates allocation of a continuation.

But what about a really clever compiler that somehow “optimizes” away the call to dont-tail-call? Such an “optimization” is highly questionable because it changes the observable semantics of the program so it is no longer call-by-value. A compiler that performed that transformation wouldn't be a Scheme compiler because Scheme is a call-by-value language.

But what if we had really clever compiler and we disregard the change in semantics? We can easily thwart the compiler by deferring the definition of dont-tail-call until run time. Even the cleverest compiler cannot see into the future.

The definition of dont-tail-call is left to the reader, as is how to defer it's definition until run time.

Afraid of Tail Recursion

It's well known fact among proponents of tail recursion that some people just don't get it. They view tail recursion at best as a quirky compiler optimization that turns some recursive calls into loops. At worst, they see it as some sort of voodoo, or a debugging pessimization. They see little value in it. Some have outright disdain for it.

Tail recursion isn't just about turning recursive calls into loops. It's about changing how you look at function calling. Tail recursion just happens to fall out of this new viewpoint.

Most programmers, I think, view function calls as if they were akin to a short vacation. You pack up the arguments in your luggage, travel to the destination, unpack your luggage, do some stuff, repack your luggage with some souvenirs, return home, unpack everything and resume life where you left off. Your souvenirs are the return value.

Should you need a vacation from your vacation, you do the same thing: pack up the arguments in your luggage, travel to your new destination, unpack your luggage, do some stuff, repack your luggage with some souvenirs, return to your original vacation spot and resume your original vacation.

Tail recursion aficionados realize that the journey itself is the important part of the function call, and that a vacation includes two journeys. On the first journey you pack up the arguments, including the return ticket, in your luggage, use the outbound ticket to journey to the destination, unpack your luggage, and start doing stuff. When you run out of stuff to do, you make the second journey. You fetch the return ticket, repack your luggage, take the ticket to wherever it leads (presumably back home), unpack everything, and resume doing whatever you were doing there.

But perhaps you want to visit grandma instead of going directly home. Then we change the script slightly. When you run out of things to do on your vacation, you pack up your luggage with your souvenirs and the return ticket, then you journey to grandma's house, where you unpack and start doing stuff. Eventually you are done visiting grandma, so then you fetch the return ticket, repack your luggage, take the ticket to wherever it leads, unpack everything, and resume doing stuff there. It's a three-legged journey. You don't go from grandma's back to the vacation resort — there's nothing left for you to do there. You take the return ticket directly home.

Viewing things this way, a function call involves packaging the arguments in a suitable way, deallocating any temporary storage, and then making an unconditional transfer to the function, where we unpack the arguments and resume execution of the program. It is simply “a goto that passes arguments”.*

A function return is simply “a goto that passes a return value”. It involves packaging the return value in a suitable way, deallocating any temporary storage, and then making an unconditional transfer to the return address, where we resume execution of the program.

A tail recursive function call is simply “a goto that passes arguments”. It involves packaging the arguments in a suitable way, deallocating any temporary storage and then making an unconditional transfer to the function, where we resume execution of the program.

Do we really deallocate temporary storage before every control transfer? Certainly a return pops the topmost stack frame, and as often implemented, a tail recursive function call deallocates its stack frame or replaces it before transferring control, but a non tail recursive call? It does so as well, it's just that it also has to pack those values into a new continuation for the return trip. We use an implementation trick to avoid the absurdity of actually moving these values around: we move the base of frame pointer instead. Voila, we simultaneously deallocate the stack frame and allocate the continuation with the right values already in place.

Deallocating storage before each control transfer is an important part of the protocol. We're making a unconditional transfer to a destination with the hope, but no guarantee, that we'll come back, so we'd better tidy up before we leave. This ensures that we won't leave a trail of garbage behind us on each transfer which would accumulate and adversely affect the space complexity of our program.

Once you view a function call and return as not being a single sequence, but each one a separate, and virtually identical sequence, then tail recursion becomes a natural consequence. Tail recursion isn't a special case of function call, it is the same thing as a function call, the only difference being whether a new continuation (the "return ticket") is allocated in order to come back. Even function returns are the same thing, the only difference being that destination is (usually) determined dynamically rather than statically. Tail recursion isn't just another optimization, it's the result of treating inter-procedural control transfer symmetrically.

Another natural consequence is greatly increased options for control flow. Instead of a strict call/return pattern, you can make "three-legged" trips, or however many legs you need. You can make loops that incorporate one, two, or even a dynamically changing number of functions. You can replace compiler-generated returns with user-provided function calls (continuation-passing style) and implement arbitrarily complex control and data flow like multiple return values, exception handling, backtracking, coroutines, and patterns that don't even have names yet. And of course you can mix and match these patterns with the standard call and return pattern as well.

The phrase "tail recursion" is shorthand for this symmetric view of interprocedural control flow and is meant to encompass all these consequences and control flow options that spring from it. It's not about simply turning recursive functions into loops.

People who are afraid of tail recursion seem unable to see any value in taking up the symmetric viewpoint despite the fact that it opens up a whole new set of control flow techniques (in particular continuation-passing style). They find the notion that a procedure call is “a goto that passes arguments” “nonsensical”. A lot of good research has come from taking this nonsense seriously.


*The view that a function call is simply a “a goto that passes arguments” was developed by Steele in his “Lambda papers”.

The important point of cleaning up before the control transfer was formalized by Clinger in “Proper Tail Recursion and Space Efficiency”.

Someone — it might have been Clinger, but I cannot find a reference — called tail recursion “garbage collection for the stack”. The stack, being so much more limited in size than the heap, needs it that much more. Indeed Clinger notes the tight connection between tail recursion and heap garbage collection and points out that heap garbage collection is hampered if the stack is retaining pointers to logically dead data structures. If the dead structures are large enough, garbage collection can be rendered useless. Yet many popular languages provide garbage collection but not tail recursion.

The only difference between a call and return is that typically the call is to a statically known location and the return address is dynamically passed as a "hidden" argument. But some compilers, like Siskind's Stalin compiler, statically determine the return address as well.

The only difference between a function call and a tail recursive function call is when you need to return to the caller to complete some work. In this case, the caller needs to allocate a new continuation so that control is eventually returned. If there is no further work to be done in the caller, it doesn't create a new continuation, but simply passes along the one that was passed to it.

Many compilers have been written that handle function calls, tail recursive function calls, and returns identically. They only change what code they emit for handling the continuation allocation. These compilers naturally produce tail recursive code.

Most machines provide special purpose support for a LIFO stack. It is tempting to use the stack for allocation of continuations because they are almost always allocated and deallocated in LIFO order, and a stack gives higher performance when this is the case. Many compilers do in fact use the stack for continuations and argument passing. Some, like Winklemann's Chicken compiler follow Baker's suggestion and treat the stack as an "nursery" for the heap. Others avoid using the stack altogether because of the extra complexity it entails. And others cannot use the stack because of constraints placed on stack usage by OS conventions or the underlying virtual machine model.

Sunday, January 19, 2020

Afraid of Recursion

Here's a trick I learned several years ago for transferring time-series data. In the case in question, I needed to transfer a bunch of timestamped records, but the server had a quirk to it. If you asked for too many records at once, it would simply return an error code and give up on your request. There was no way to know beforehand how many records might exist in any given time span, so you could get an error code on nearly any request, unless it was for a very short time span. On the other hand, many requests for long time spans would succeed because they had few records in them. Despite this quirk, the code was really simple:
List<record> xfer (Timestamp start, Timestamp end) {
    try {
        return tryFetchRecords(start, end);
    } catch (TooManyRecordsException e) {
        Timestamp mid = (start + end)/2;
        List<record> firstHalf = xfer (start, mid);
        List<record> secondHalf = xfer (mid, end);
        return firstHalf.addAll(secondHalf);
    }
}
On any request, if the server returned the error code, we would simply bisect the time span, recursively ask for each half separately, and combine the two halves. Should the bisected time span still contain too many records, the time span would be bisected again. The recursion would continue until the time span was small enough that the server could honor the request and return some records. The recursion would then unwind, combining the returned records into larger and larger lists until we had all the records for our original time span. Since the time span would be cut in half on each recurrence, the depth of recursion would be proportional to the logarithm (base 2) of the total number of records, which would be a reasonably small number even with an enormous number of records.

It's certainly possible to avoid recursion and do this iteratively by some form of paging, but the code would be slightly more complex. The server is not very cooperative, so there is no easy way to determine an appropriate page size beforehand, and the server doesn't support a “paging token” to help keep track of progress. The recursive solution finds an appropriate transfer size by trial and error, and keeps track of progress more or less automatically. An iterative paging solution would have to do these things more explicitly and this would make the iterative code a bit more complex. And why add any complexity when it isn't really necessary?

I thought this solution was really cool when I first saw it. I've used this trick for transferring time series data many times. It makes the server very simple to write because the protocol requires so little of it. It simply has to refuse to answer requests that return too many results. The client code is just about the 10 lines above.

But when I first suggest this code to people I usually get “push back” (that is, until they see it work in action, then they usually get on board with it). People seem unsure about the use of recursion and want a more complex protocol where the client and server negotiate a page size or cooperatively pass a paging token back and forth on each request and response. Their eyes glaze over as soon as they see the recursive call. They want to avoid recursion just because it's recursive.

I've seen “aversion to recursion” happen in a lot of circumstances, not just this one. Recursion isn't the solution to everything. No tool solves all problems. But it is an important tool that often offers elegant solutions to many problems. Programmers shouldn't be afraid of using it when it is appropriate.

Saturday, January 18, 2020

Unsyndicated blog

I've noticed that my blog posts are replicated in Planet Lisp and Planet Scheme, and here I am spamming them with random math stuff. So I'm creating a new blog, Jrm's Random Blog, where I can feel free to post about math, science, computers in general, and whatever else bugs me, without spamming the Lisp and Scheme readers. I'll keep posting to Abstract Heresies, but try to keep it more Lisp and computer language focused.

Thursday, January 16, 2020

Groups, semigroups, monoids, and computers

The day after I rant about mathematicians, I make a math post. “Do I contradict myself? Very well, then, I contradict myself, I am large, I contain multitudes.” — Walt Whitman

A group is a mathematical concept. It's pretty simple. It consists of a set, G, and an operation, *, which can be used to combine any two elements of G. What the set contains is not that important. It is the * operation we're interested in, and we can usually swap out G for another set without causing too many problems other than having to change the type signature of *. There are four axioms that * must obey
  • Closure—combining any two elements of G using * just gives you another element in G.
    Note that this means you can build an arbitrary binary tree of combinations: e.g.(* (* a b) (* (* c d) e))). These trees will always be like a tree of cons cells. In some sense, the closure axiom is equivalent to saying that all the elements of G have the same type and that the * operator operates on values of that type and produces values of that type. The closure axiom along with the binary operation means that we can reduce any tree of combinations to a single value.
  • Associativity(* (* a b) c) = (* a (* b c)) for any a, b, and c. This implies that you can take any arbitrary tree of combinations: e.g.(* (* a b) (* (* c d) e))) and simply flatten it into a list (* a b c d e), or given the flat sequence (* a b c d e) we can add parenthesis anywhere we like: (* a (* b c) d e). If we stop here and only have the closure and associativity axiom, we have what is called a “semigroup”. You can use the * operation to “fold” a semigroup down to single value, or to keep an accumulator and incrementally fold elements into the accumulator.
  • Identity element—There has to be an identity element id such that (* id x) = (* x id) = x for all x. It will be unique. If you see the identity object in a combination (* a b id c d), you can simply remove it: (* a b c d). The identity element also comes in handy as an initial value when you are folding a sequence. If you have some concept that would be a group except it doesn't have an identity element, then you can often just make one up and add it to the set G.
  • Inverse element—For every element in G there has to be another element, that when combined with the first, gives you the identity. So if a is an element in G, there has to be some other element, call it b, such that (* a b) = (* b a) = id. The inverse element is usually notated with a little -1: a-1. If you have an element in a combination right next to it's inverse: (* a x x-1 c), you can combine the element and it's inverse to get the identity: (* a id c), and then remove the identity: (* a c)
Frequently you run into something that obeys all the axioms but the inverse element axiom. This is called a monoid. A monoid is very much like a group except that you can get “stuck” when manipulating it if you run into one of the non-invertible elements because there's no inverse to “undo” it. There are certain things about monoids that are true only “if the appropriate inverses exist”. You run into that qualifier a lot when dealing with monoids. You don't need that qualifier if you are dealing with a group because they do exist by axiom. Or we could say that calling something a group is simply shorthand for adding “if the appropriate inverses exist” everywhere.

What does this have to do with computers? Consider the set of all subroutines with the operation of concatenation. It is closed — concatenating two subroutines gives you a third subroutine. It is associative — you just concatenate them linearly. There is an identity element, usually called no-op. And many, but not all, subroutines have inverses. So we have a monoid.

Consider the set of all strings with the operation of concatenation. It is closed, associative, the empty string is the identity element. It is a monoid.

Consider the set of functions whose input type is the same as the result type with the operation of composition. It is closed, associative, the identity function is the identity element. It is a monoid. If we consider only the subset of functions that also have inverses, we have a group. This particular monoid or group comes in especially handy because composition of functions is so useful.

Consider the set of invertible 2x2 matrices with integer components, a determinant of 1 or -1, and the operation of matrix multiply. It is closed, associative, there is an identity matrix, and I already said just consider the invertible ones. It forms a group. This group comes in handy for implementing arbitrary precision arithmetic. (Thanks to Bradley Lucier for the correction of the condition on the determinant. This makes the matrix continue to have integer components upon inversion, keeping things closed.)

The permutations of a list form a group. The integers under addition form a group.

These things are everywhere. And it isn't a coincidence. The concepts of a group, monoid, and semigroup are meant to capture the essence of what it is to have a foldable sequence of elements. (Can I complain about mathematicians here? They make up so much terminology and abstraction that it is virtually impossible to get at what they really mean. We're just talking about sequences of elements and trying to find some minimal axioms that you need to have to fold them, but try to find literature that actually says that's what we're doing is like trying to pull hen's teeth.)

So what good are groups, monoids, and semigroups? Aside from the obvious fact that foldable sequences are ubiquitous and really useful, that is. Not immediately apparent from the axioms is that in addition to folding a sequence, you can transform a sequence into a different, but equivalent one. If the appropriate inverses exist (there's that phrase), you can “unfold” some or all elements of a sequence. So by judicious folding and unfolding, you can transform a sequence.

Here's an unusual abstract example. Consider a pipeline which has a set of nodes and communicates values of the same type between the nodes. Values accumulate at the nodes until they are transmitted to the next node in the pipeline. We start with all the values in the initial node (on the right) and transmit them to the left:
(pipeline (node) (node) (node a b c))  ;; transmit the a
(pipeline (node) (node a) (node b c))  ;; transmit the b
(pipeline (node) (node a b) (node c))  ;; transmit the a
(pipeline (node a) (node b) (node c))  ;; transmit the c
(pipeline (node a) (node b c) (node))  ;; transmit the b
(pipeline (node a b) (node c) (node))  ;; transmit the c
(pipeline (node a b c) (node) (node))  ;; done
If the values we transmit are drawn from a group, we can replace each node with the group's * operator:
(* identity identity (* a b c))  ;; transmit the a
(* identity (* identity a) (* b c))  ;; transmit the b
(* identity (* a b) (* identity c))  ;; transmit the a
(* (* identity a) (* identity  b) (* identity c))  ;; transmit the c
(* (* identity a) (* b c) identity)  ;; transmit the b
(* (* a b) (* identity c) identity)  ;; transmit the c
(* (* a b c) identity identity)  ;; done
The astute reader will notice that all we're doing is making use of the associativity axiom and moving the parenthesis around so that the values seem to move between the different nodes. But we preserve the invariant that the “value” of the entire pipeline doesn't change as the values move. The * operator need not be concatenate, which would give simple queuing behavior, but can be any operator satisfying the axioms giving us much more interesting pipelines. One implementation of arbitrary precision arithmetic transmits Möbius transformations along just such a pipeline to refine the upper and lower limits of a computed approximation. In this implementation, the * operator is the composition of Möbius transformations.

Here's a more concrete example. If you have a series of nested functions: (f (g x)) and both f and g take and return the same type, rewrite it as ((compose f g) x) and use a little group theory on it.
(f (g x))
((compose f g) x)
;; or more explicitly
((fold-left compose identity (list f g)) x)
If the appropriate inverses exist, then there will be another function h such that (compose f g) is equal to (compose h f) essentially allowing you to “slide” g to the left “through” f. It is relatively easy to see that h must be equivalent to (compose f g f-1). Mathematicians say that h is conjugate to g. Conjugates always have a form like aba-1. By finding conjugates, you can take a sequence and slide the elements left and right through other elements. This also allows you to fold things out of order. (Or in the pipeline example, transmit items out of order.) If we were left folding into an accumulator, folding h before f is equivalent to folding g after f. Another way of looking at it is this. Suppose we're standing to the left of f and looking through the “lens” of f at g. h is what g “looks like” when viewed through f.

If we want, we can define slide such that (compose slide (compose f g)) is equivalent to (compose h f). slide is (compose h f g-1 f-1). (This isn't a generic slide sequence, it only works on (compose f g). It ought to be an identity because (compose f g) is equivalent to (compose h f).) I complained that mathematicians provided too few concrete examples, so here is a concrete example using list permutations:
> (reverse (rotate-left '(a b c d)))
(a d c b)

;; rewrite as explicit fold-left of compose
> ((fold-left compose identity (list reverse rotate-left)) '(a b c d))
(a d c b)

;; sliding rotate-left through reverse turns it into rotate-right
> ((fold-left compose identity (list rotate-right reverse)) '(a b c d))
(a d c b)

;; A sequence that when composed with (list reverse rotate-left) turns it into
;; (rotate-right reverse)
> (define slide 
    (fold-left compose identity (list rotate-right reverse rotate-right reverse)))
slide

> ((fold-left compose identity (list slide reverse rotate-left)) '(a b c d))
(a d c b)

;; rewrite back to direct procedure calls
> (rotate-right (reverse '(a b c d)))
(a d c b)

;; and slide ought to be an identity
> ((fold-left compose identity (list slide)) '(a b c d))
(a b c d)

Or suppose you have (f (g x)), but for some reason you want(g (f x)) (which would, in general, be a different value unless f and g happen to commute). Again, rewrite (f (g x)) as ((compose f g) x) and apply a little group theory. If the appropriate inverses exist, there will be a function commute-fg such that (compose commute-fg (compose f g)) is equivalent to (compose g f). With a little thought, you can see that commute-fg is equivalent to (compose g f g-1 f-1). (Again, this isn't a generic commute, it only causes this specific f and g to commute.) commute-fg is called a commutator because it makes f and g commute. Commutators always have the form aba-1b-1. By finding commutators and inserting them in the right place, you can take a sequence and swap adjacent elements. Again, a concrete example with lists:
;; an illustration of what swap-first two does
> (swap-first-two '(a b c d))
(b a c d)

;; we're given
> (reverse (swap-first-two '(a b c d)))
(d c a b)

;; but we want, for some reason to reverse first
> (swap-first-two (reverse '(a b c d)))
(c d b a)

;; rewrite as fold-left of compose
> ((fold-left compose identity (list reverse swap-first-two)) '(a b c d))
(d c a b)

;; define our commutator
;; note that swap-first-two and reverse are their own inverses
> (define commute-fg
    (fold-left compose identity (list swap-first-two reverse swap-first-two reverse)))

;; make f and g commute
;; observe that it returns the desired result
> ((fold-left compose identity (list commute-fg reverse swap-first-two)) '(a b c d))
(c d b a)

There's two interesting things here. First, notice that in both examples I convert (f (g x)) to ((fold-left compose identity (list f g)) x) and then proceed to ignore x and just consider (fold-left compose identity (list f g)) as if x didn't exist. I've abstracted away the x. (Of course I have to eventually supply the x if I want an answer, but it only comes back at the last moment.) Second, notice that although slide and commute-fg are foldable sequences, I use them as if they were higher order functions operating on the foldable sequence (compose f g) to transform it, first into (compose h f), second into (compose g f). This second thing is a neat trick. We're taking a function that operates on lists and treating it as if it were a higher-order function that operates on functions. This is called the “action” of slide and commute-fg because it appears as if elements of the set G of our group can “act” directly on other elements.

Every element in the underlying set G of a group has an action associated with it which operates directly on other elements in G. This is an important concept in group theory. Now earlier I said that the actual elements of G don't matter much, so the action must be more closely tied to the operator *. And if we swap out G for another set we'll still have the same actions, they'll just be associated with the elements of the new set (in an isomorphic way). The actions are pretty abstract.

There's a lot more one could say about the actions. They are a rich source of interesting math. My brain is getting fatigued with all this abstraction, so I'll leave the topic be for now.

If group theory is about the essence of what it means to have a foldable sequence, then category theory is about the essence of composition. They offer two somewhat different approaches to similar material. What do you do with sequences but compose them? What comes from composition but a sequence? Many concepts in group theory carry over into category theory. Naturally a completely different set of terminology is used, but the concepts are there.

But that's enough group theory for today and category theory can wait until later posts.

Wednesday, January 15, 2020

Math is hard, let's go shopping

I find mathematics, with all it's weird terminology and abstraction and equations, hard to understand. That's kind of funny coming from someone like me who makes a living from a branch of mathematics. I find computers and programming to be rather easy to understand — probably because I've had a lot of practice. But computer science is just applied logic and programming is arguably just the study of the computable functions, so you'd think math would come naturally. It doesn't.

One problem I've found is that as much as mathematicians pride themselves on rigor, they tend to be a bit sloppy and leave out important details. Computer scientists don't leave out important details because then the programs won't run. It's true that too much detail can clutter things up, but leaving out the detail and relying on “context” just increases the intellectual burden on the reader.

I will give mathematician's credit for thinking about edge cases perhaps more than a computer scientist would. It can be easy to be a bit complacent with edge cases because the computer will likely do something even if you don't think too hard about what it ought to do. But a good computer scientist tries to reduce the number of edge cases or at least make them coherent with the non-edge cases.*

Mathematicians seem to take perverse pleasure in being obscure. Computer scientists strive to be as obvious as possible because like as not, they are the ones that have to revisit the code they wrote and don't want to have to remember what they were thinking at the time. It's just easier to spell things out explicitly and obviously so that you can get back up to speed quickly when you have to debug your own stupid code. Every time I pick up some literature on category theory, I get hit with a “Wall of Terminology” denser than the “Wall of Sound” on a Phil Spector recording. It's fundamentally simple stuff, but it is dressed up in pants so fancy one has a hard time extracting the plain meaning. What seems to be universal in category theory is my difficulty in getting past page 4.

I once read a mathematical paper that talked about an algorithm with three tuning parameters: α, β, and another α. No decent computer programmer would give the same name to two different variables. Which α was which was supposed to be “obvious” from the context. The brainpower needed to keep track of the different αs was absurd and a complete waste of effort when calling the variable something else, like γ would have done the trick.

And don't ask a mathematician to write computer code. That's the one time they'll leave out all the abstraction. Instead of a nice piece of abstract, functional code, you'll get a mess of imperative code that smashes and bashes its way to a solution with no explanation of how it got there. It's a lot easier to take some abstract, functional code and figure out a more optimal way, probably imperative way to do it than it is to take a more optimal imperative piece of code and figure out the abstract, functional meaning of it.

I've found it to be extremely helpful when a computer paper includes one or two concrete examples of what it is talking about. That way, if I try to go implement code that does what the paper suggests, there's some indication that I'm on the right track. I'm more confident that I understand the paper if I have working code that produces the exact same values the paper's authors got. It's harder to find concrete examples in a math paper, and it is easier to think you know what it says but be far off base if there aren't any examples.

Maybe I shouldn't blame mathematicians so much and look a little closer to home. Perhaps I should study harder instead of demanding to be spoon fed difficult concepts. But then I read Feynman, S&ICP, S&ICM, and Jaynes and discover that maybe I just need a simple explanation that makes sense to me.

Sturgeon's Revelation is “90% of everything is crap”. This is true of both mathematical papers and computer science papers.



*An old joke illustrates the importance of thinking of edge cases: A programmer implements a bar. The test engineer goes in and orders a beer, orders zero beers, orders 999999999 beers, orders -1 beers, orders a lizard, and declares the bar ready for release. The first customer comes in and asks to use the restroom. The bar catches fire and burns down.

Tuesday, January 14, 2020

Palindromes, redux, and the Sufficiently Smart Compiler

The Sufficiently Smart Compiler is mentioned by authors as shorthand for “a compiler that performs nearly all reasonable optimizations, but in particular this one I want”. Many attempts were made up through the 80's and maybe into the 90's to write a Sufficiently Smart Compiler that would perform all “reasonable” optimizations, and although many impressive results have been obtained, there always seem to be fairly obvious optimizations that remain unoptimized. These days it seems that people realize that there will be good compilers and some very good compilers, but never a Sufficiently Smart Compiler. Nonetheless, it is worth considering a Sufficiently Smart Compiler as a tool for thought experiments.

I was curious what would be necessary for a Sufficiently Smart Compiler to generate optimal code for the palindrome problem given the naive algorithm.

The naive algorithm is inspired by the axioms
  • A zero or one element string is a palindrome.
  • If the first char matches the last char, and the middle is a palindrome, the result is a palindrome.
and gives us this:
(define (palindrome1? string)
  (or (< (string-length string) 2)
      (and (char=? (string-ref string 0)
                   (string-ref string (- (string-length string) 1)))
           (palindrome1? (substring string 1 (- (string-length string) 1))))))

The higher performing algorithm is inspired by the idea of keeping two pointers to each end of a string and comparing the characters at the pointers. If the characters are the same, you move the pointers inward and when they meet, you have seen a palindrome. If at any point the characters differ, you don't have a palindrome:
(define (palindrome2? string)
  (define (scan front-pointer rear-pointer)
    (or (>= front-pointer rear-pointer)
        (and (char=? (string-ref string front-pointer)
                     (string-ref string rear-pointer))
             (scan (+ front-pointer 1) (- rear-pointer 1))))
  (scan 0 (- (string-length string) 1)))
As you can see, these really aren't very different to start with. Both algorithms are iterative and both work their way in from the outside of the string. There are basically two differences. First, access to the rear of the string is either by a rear pointer, or by using the string-length of the string and subtracting 1. Second, the iterative call either uses substring or moves the pointers closer together.

First, let's assume that our processor has can reference through an indexed offset. This would mean we could point at the element one beyond the rear-pointer and not incur overhead. This isn't an unreasonable assumption for a CISC architecture such as an x86, but would probably cause 1 instruction overhead on a RISC architecture. So the second algorithm becomes this:
(define (palindrome2? string)
  (define (scan front-pointer rear-pointer)
    (or (< (- rear-pointer front-pointer) 2)
        (and (char=? (string-ref string front-pointer)
                     (string-ref string (- rear-pointer 1)))
             (scan (+ front-pointer 1) (- rear-pointer 1)))))
  (scan 0 (string-length string)))

Now this next assumption is a bit more of a stretch. The implementation of palindrome1? uses substring on each iteration and that's going to result in a lot of string copying. If our implementation used “slices” instead of copying the string, then there will be a lot less copying going on:
(define (palindrome1? string)
  (or (< (- (slice-end string) (slice-start string)) 2)
      (and (char=? (string-ref string (slice-start string))
                   (string-ref string (- (slice-end string) 1)))
           (palindrome1? 
             (substring string (+ (slice-start string) 1) (- (slice-end string) 1))))))

It is not uncommon for a compiler to introduce internal procedures for looping, so we can do that.
(define (palindrome1? string)
  (define (scan slice)
    (or (< (- (slice-end slice) (slice-start slice)) 2)
        (and (char=? (slice-ref slice (slice-start slice))
                     (slice-ref slice (- (slice-end slice) 1)))
             (scan (subslice slice (+ (slice-start slice) 1) (- (slice-end slice) 1))))))
  (scan (make-slice 0 (string-length string))))

We'll enter fantasy land again and let our compiler be smart enough to “spread” the slice data structure into the argument list of scan. This is no doubt asking too much from our compiler, but the information is available and it could in theory be done:
(define (palindrome1? string)
  (define (scan slice-start slice-end)
    (or (< (- slice-end slice-start) 2)
        (and (char=? (slice-ref string slice-start)
                     (slice-ref string (- slice-end 1)))
             (scan (+ slice-start 1) (- slice-end 1)))))
  (scan 0 (string-length string)))

And now we have palindrome2? (modulo renaming).

This doesn't really prove anything. But with a couple of somewhat unlikely compiler tricks, the naive version could be transformed to the more optimized version. It suggests that a it would be surprising but not a complete shock for an ambitious compiler writer to attempt.

I wish someone would write that Sufficiently Smart Compiler.

Monday, January 13, 2020

Cons cells vs. Linked Lists

Cons cells and linked lists are the meat and potatoes of Lisp programming. Linked lists are the primary structure that everything operates on and cons cells are the Lego blocks they are made of. For an experienced Lisp programmer, cons cells just fade into the background. You know they are there as the glue holding everything together, but it is the linked list that you keep in mind. One could construct all sorts of weird trees, dags, and graphs out of cons cells, but in general you keep things in nice linear singly-linked lists terminated with a nice, full-stop NIL.

Cons cells are nearly the perfect concrete implementation of an abstract two-tuple. They are first-class objects: you can assign them to variables, stuff them in arrays, pass and return them as values, and check them for identity. They are orthogonal to other data types; only a cons-cell returns 't to consp. They are opaque — except for the defined operations of car and cdr, you cannot access the contents of a cons cell. And while they are usually implemented as adjacent memory locations, they hide their representation and there have been many Lisps that have used unusual concrete representations of cons cells like parallel arrays of the car and cdr parts or bit codes to omit the cdr altogether through “cdr coding”. All operations on cons cells can be reduced to the basic operations cons, consp, car, cdr, (setf car), and (setf cdr). (If we had immutable cons cells, we could even get rid of the last two, but then we'd want some other means for creating circular and semi-circular structure.*)

So I find it somewhat surprising that the standard linked list implementation in Lisp is a just a terrible example of an abstract data type. This no doubt happened because linked lists got standardized well before abstract data types were really understood.

The big problem with linked lists is that instead of being orthogonal to other data types, it is a subdomain of cons-cells. The representation of a singly linked list is completely exposed: it is a cons cell, without even a wrapper object to tell you if you are dealing with the list itself or its representation. It is only by common convention that certain cons cell structures are considered to represent linked lists. And it isn't immediately clear whether the representation is meant to be a pointer to the first pair of the list, or to the entire “spine” of the list. It is often treated both ways. There is little distinction between a list primitive and a cons cell primitive, which usually doesn't get you into trouble, except in those few cases where it can cause major confusion, like when you have to handle “improper” or “dotted” lists.

Lists are mutable because their representation is mutable and not hidden. It is possible to mutate the representation such that it no longer represents a list anymore, “magically” changing any list that includes the mutated structure into something else. This means either a lot of defensive copying must be done if lists are used as arguments or passed as values, or an unenforced convention to avoid mutation of list structure must be developed in the Lisp culture. We've been pretty good at the latter, even documenting when you can and when you cannot rely on lists being mutated by library functions, but there are always a few people who go against the grain for the sake of “efficiency” (or plain orneriness) and write code that is impossible to use because you cannot easily tell what might be mutated behind your back.

With any abstract data type, there are conceptually a pair of functions that are used to transport objects across the abstraction barrier. One, call it abs->rep, takes an abstract object and exposes its representation. It is usually provided automatically by the compiler and called upon entry to the object's methods. In Java, for example, it establishes bindings for the this pointer and the private and protected fields of the object so that the method can use them. The complimentary function, call it rep->abs takes the representation of an object and hides it in an opaque, abstract version for clients of the object to use. The clients have no way to manipulate the representation of the object because they only have access to opaque, abstract version. In Java, for example, the compiler automatically does this after object construction and when the this pointer is returned properly cast to the abstract data type. The this pointer and private and protected fields of the object go out of scope and are no longer accessible.

These functions are usually provided by the compiler and often have no real implementation. The compiler simply ensures that representation comes into scope when the method is called (conceptually calling abs->rep) and that the representation goes out of scope when the method returns (conceptually calling rep->abs). No actual code is generated or exists at run time. It's easy to forget this is happening because the compiler does all the work for you. You just toggle the little bit in your head about whether you are “inside” the object or “outside” the object. If you forget, you can just examine the lexical nesting to see if the representation is in scope.

In Lisp, however, for a singly linked list, not only are these functions omitted, they are completely fictitious. It is only in the programmers head that what was once considered a linked list is now to be considered a pointer to head cell of list (abs->rep) and only probably in the programmers head that the reverse (rep->abs) is happening on the way out. It doesn't matter much if he or she forgets this because the written code is the same either way. It only matters if he or she somewhere down the line uses a cons-cell operation where a list operation is actually what should be used. This can lead to common rookie mistakes like
  • Using cons where list is wanted, yielding (1 . 2) where (1 2) is desired. (The “unwanted dot” problem.)
  • Using list where cons is wanted, yielding (1 (2)) where (1 2) is desired. (The “too many parenthesis” problem.)
  • Confusion about whether ((1 2) 3 4) is meant to be a three-tuple of a list and two integers, or a two-tuple of two lists. (It's both, depending on the unwritten intent of the programmer.)
  • Using cons or list where append is wanted, yielding ((1 2) 3 4) or ((1 2) (3 4)) when (1 2 3 4) is desired. (Again, “too many parenthesis”.)
  • Use of (append ... (list <element>)) to “cons” to the “right end” of a list, leading to O(n2) algorithms rather than O(n).
Now don't get me wrong. I like Lisp and I like linked lists. And I'm not suggesting we avoid using them in favor of some other well-designed abstract data type. I just think they're an awful example of how to implement an abstract data type and perhaps that's why it is difficult for beginners to learn how to use them properly. It might also be worthwhile to implement a Lisp with proper (and immutable) abstract linked lists. It wouldn't make much difference to experienced programmers who are already used to applying the representation/abstraction interface in their heads, but it might make it easier for novices to manipulate linked list and cons cells (and keep them apart).

If you want to be completely contrary, consider Olin Shiver's suggestion: all objects — cons cells, strings, integers, null, etc. — are lists. It's just that every object other than a cons cell is a zero element dotted list. Now rather than being a subtype of cons cells, lists become a supertype of all objects. This viewpoint can probably be made coherent, but it does raise a lot of questions. Here are some that come to mind:
  • Is (length '(1 2 . 3)) the same as (length '(1 2 3))? If not, what is (length '(1 2 . 3))
  • Should lists retain their “dottedness” when passed through functions like memq or map? What is (memq 2 '(1 2 . 3))? What about (memq 3 '(1 2 . 3))?
  • What is (reverse '(1 2 . 3))? Is (compose reverse reverse) an identity?
This was extensively discussed on the SRFI-1 mailing list, so I won't rehash the discussion here. The questions I raised above, and many more, were raised and discussed. Eventually, it was decided that continuing to be backwards compatible was an important consideration. (Personally, I think the notion plays havoc with the group theoretic properties of lists, and that is enough to make it suspect.)

There is a good argument that “dotted” lists are rarely used and almost always a mistake, but they are built in to the grammar of Scheme as an indicator of “rest” arguments, so getting rid of them would require some other way to specify “rest” arguments. Racket takes things further by allowing doubly dotted lists to indicate infix notation: (a . < . b)

Just for kicks, I took things in the other direction and wrote some C# code that implements singly-linked lists as their own abstract data type using special, immutable cons cells that require that their CDR be either an existing singly-linked list or the empty list. “Dotted” lists are not a problem because you simply cannot construct one. The representation of a list is explicitly coded as a pointer to the head cons cell of the list. The code illustrates how the abstract list is turned into a the pointer to the cons cell when it is carried across the abstraction barrier and how it is turned back into an abstract list when carried back out. Again, I'm not suggesting anyone use the code, or take it as a serious proposal. (For one thing, it doesn't address what to do about circular lists, or the dotted lists in the Scheme grammar.) It was just a fun hack for illustrative purposes. It is available here for those interested.

*Many years back, Henry Baker said “C'mon, cons cells should just be immutable.” (if I am remembering the exact quote correctly). I agree with his sentiment. Combine immutable cons cells with “hash consing” and the appropriate equality primitives and you get directed acyclic graphs (and their space properties) “for free”. We'd either have to do without circular structure or use another means to achieve it. Since circular structure often leads to divergent programs I wouldn't consider it a great loss, but some may disagree. Perhaps they might be assuaged by a nice set of primitive procedures for creating and manipulating circular cons cell structure.

Sunday, January 12, 2020

Just for fun, transformations on lists

The mathematician in me likes to think about what happens to data and programs when you apply certain transformations to them. Here's a simple example. Imagine the function swap that simply makes a new cons cell by swapping the car and cdr of an existing cons cell:
(define (swap cell) (cons (cdr cell) (car cell)))

> (swap '(1 . 2))
(2 . 1)

> (swap (swap '(1 . 2)))
(1 . 2)
As we'd expect, two swaps are equivalent to no swaps at all. Indeed any even number of swaps are equivalent. Any odd number of swaps is equivalent to one swap.

What if we call swap on a list?
> (swap '(1 2 3))
((2 3) . 1)
That's odd looking. But swapping again returns it to normal
> (swap (swap '(1 2 3)))
(1 2 3)

But swap only swaps the top-level cell. Let's define deep-swap that descends into the car and cdr if possible:
(define (deep-swap cell)
  (cons (if (pair? (cdr cell)) (deep-swap (cdr cell)) (cdr cell))
        (if (pair? (car cell)) (deep-swap (car cell)) (car cell))))

> (deep-swap '((1 . 2) . (3 . 4)))
((4 . 3) 2 . 1)
Wait, what? Oh, the list printer is just interpreting the second cons cell as a part of a top-level list. We can see this by trying this:
> '((4 . 3) . (2 . 1))
((4 . 3) 2 . 1)
So we just have to be aware of list printer eliding the dots for us.

What if we call deep-swap on a list?
> (deep-swap '(1 2 3 4))
((((() . 4) . 3) . 2) . 1)
Fortunately, deep-swap, like swap, undoes itself.
> (deep-swap (deep-swap '(1 2 3 4)))
(1 2 3 4)
It's easy to see that swap and deep-swap should commute.
> (equal? (swap (deep-swap '((a . b) . (c . d))))
          (deep-swap (swap '((a . b) . (c . d)))))
#t
Alternatively, define compose
;; Simple composition of two functions
(define (compose2 outer inner)
  (lambda (x) (outer (inner x))))

;; Composition of arbitrary number of functions
(define (compose f . fs)
  (if (null? fs)
      f
      (compose2 f (apply compose fs))))

> (equal? ((compose swap deep-swap) '((a . b) . (c . d)))
          ((compose deep-swap swap) '((a . b) . (c . d))))
#t
So you can just move all the swaps together and all the deep-swaps together, then remove pairs of each one.

swap and deep-swap don't have very complex behavior, so let's turn to lists. We can define rotate-left as follows:
(define (rotate-left l) (append (cdr l) (list (car l))))

> (rotate-left '(1 2 3 4))
(2 3 4 1)

> (rotate-left (rotate-left '(1 2 3 4)))
(3 4 1 2)
(This is horribly inefficient, so this is just for fun, not production code.) Now what happens when we combine rotate-left with reverse?
> (reverse (rotate-left (reverse '(1 2 3 4))))
(4 1 2 3)

(define rotate-right (compose reverse rotate-left reverse))

> (rotate-right '(1 2 3 4))
(4 1 2 3)
rotate-left becomes rotate-right when used “under” reverse. Of course rotate-left doesn't commute with reverse: (reverse (reverse (rotate-left '(1 2 3 4)))) the reverses cancel each other and we're left with a rotate-left.

We can define “deep” versions of reverse, rotate-left, and rotate-right:
(define (deeply f)
  (lambda (l)
    (if (list? l)
        (f (map (deeply f) l))
        l)))

> ((deeply reverse) '((1 2 3) 4 5 (6 7 (8 9 10))))
(((10 9 8) 7 6) 5 4 (3 2 1))

> ((deeply rotate-left) '((1 2 3) 4 5 (6 7 (8 9 10))))
(4 5 (7 (9 10 8) 6) (2 3 1))

> ((deeply rotate-right) '((1 2 3) 4 5 (6 7 (8 9 10))))
(((10 8 9) 6 7) (3 1 2) 4 5)
Naturally, a (deeply rotate-left) will undo a (deeply rotate-right). You might suspect that the composition of (deeply reverse), (deeply rotate-left), and (deeply reverse) is equivalent to (deeply rotate-right), and you'd be right (I suspected as much, too, but it didn't seem so obvious, so I checked).

Notice that the deeper list structure has 3 elements each, but the topmost list structure has 4 elements, so 3 deep rotations is equivalent to one shallow rotation in the opposite direction, or (compose rotate-left (deeply rotate-left) (deeply rotate-left) (deeply rotate-left)) is an identity. In fact, the shallow rotate-left commutes freely with (compose (deeply-rotate left) (deeply rotate-left) (deeply rotate-left))
;; These are all equivalent identities
(compose rotate-left (deeply rotate-left) (deeply rotate-left) (deeply rotate-left))
(compose (deeply rotate-left) rotate-left (deeply rotate-left) (deeply rotate-left))
(compose (deeply rotate-left) (deeply rotate-left) rotate-left (deeply rotate-left))
(compose (deeply rotate-left) (deeply rotate-left) (deeply rotate-left) rotate-left)

Arbitrary application of these operators will scramble your list structure much like arbitrary rotations will scramble a Rubik's cube. The analogy is more than skin deep: group theory can be used to describe and analyze the combinatorics of both. Group theory tells us that operations of the form F-1GF are likely to be interesting. And indeed:
> ((compose rotate-right reverse rotate-left) '((1 2 3) 4 5 (6 7 (8 9 10))))
(4 (1 2 3) (6 7 (8 9 10)) 5)

(define involute (compose rotate-right reverse rotate-left))
swaps the outside elements with the inside ones. And if we compose a rotate-left with this, we find we've reversed only the last three elements in the list ((1 2 3) (6 7 (8 9 10)) 5 4).

Just using these operators, there seems to be no way to get to get to '((1 2 3) 5 4 (6 7 (8 9 10))) (at least I couldn't find one), so I defined another operator:
(define (call-on-tail f)
  (lambda (x)
    (cons (car x) (f (cdr x)))))
which leaves the head element alone while applying the transformation to the rest.
> ((compose rotate-left reverse (call-on-tail rotate-right) involute)
   '((1 2 3) 4 5 (6 7 (8 9 10))))
((1 2 3) 5 4 (6 7 (8 9 10)))

These functions can move elements up and down the list structure:
(define (leftmost c)
  (if (pair? c)
      (leftmost (car c))
      c))

(define (replace-leftmost c new-value)
  (if (pair? c)
      (cons (replace-leftmost (car c) new-value) (cdr c))
      new-value))

(define (rightmost c)
  (if (pair? c)
      (if (null? (cdr c))
          (rightmost (car c))
          (rightmost (cdr c)))
      c))

(define (replace-rightmost c new-value)
  (if (pair? c)
      (if (null? (cdr c))
          (cons (replace-rightmost (car c) new-value) (cdr c))
          (cons (car c) (replace-rightmost (cdr c) new-value)))
      new-value))

(define (swap-ends l)
  (replace-leftmost (replace-rightmost l (leftmost l)) (rightmost l)))

> (swap-ends '((1 2 3) 4 5 (6 7 (8 9 10))))
((10 2 3) 4 5 (6 7 (8 9 1)))

> ((compose involute swap-ends involute) '((1 2 3) 4 5 (6 7 (8 9 10))))
((1 2 3) 5 4 (6 7 (8 9 10)))

> ((deeply swap-ends) '((1 2 3) 4 5 (6 7 (8 9 10))))
((6 2 1) 4 5 (8 7 (10 9 3)))

> ((compose (deeply swap-ends)
            (deeply swap-ends)
            (deeply swap-ends)
            (deeply swap-ends)
            (deeply swap-ends)) '((1 2 3) 4 5 (6 7 (8 9 10))))
((1 2 3) 4 5 (6 7 (8 9 10)))

There's no real application for all this, except maybe to come up with some puzzles. It's just fun to noodle around with list transformations to see what you can come up with, and to practice your list manipulation skills. You really need a language with a REPL to play around like this. A parsimonious syntax like Lisp helps, too. It would have been a bit more difficult to fool around if I had to put the appropriate commas, curly braces, brackets, and semicolons in just right.

None of these operations work on circular lists, but you can imagine that rotations and reversals could work on fully circular lists, but I'm not sure how they'd make sense on lists with circular tails. It would be challenging to make them work, though. They also don't work on “dotted” lists — they throw an error when they run into the dotted item at the end of the list. But it is fairly easy to imagine how they might be made to work on a dotted list. It would be much less of a challenge to implement.

Saturday, January 11, 2020

Gendl / SBCL / Ubuntu / WSL / Windows and an experiment in live blogging

I'm helping David Cooper by trying to run a demo of his Gendl software. He's preparing a new release and I get to be the guinea pig. For fun, I'm live blogging as we go along just as an experiment.

I'm running SBCL 1.4.5debian under Ubuntu 18.04.3 under Windows Subsystem for Linux (WSL 1) under Windows 10 Home edition. I have to say that Ubuntu on WSL is remarkably stable and useful. It seems to act just like the Ubuntu I'm used to using and runs ELF executables without modification. The GNU tool chain just seems to work and I used apt-get install to fetch and install SBCL, which hasn't complained either. I've git cloned David's release and I'm now awaiting further instruction.

While waiting, I've installed slime and am running SBCL under Emacs 25.2.2. Quicklisp is used to install and start Gendl. This starts up a web server that provides the UI in another thread.

So far, this entire melange of software is working quite smoothly despite the oddball combination of the parts. Lisp has habit of tickling memory protection bugs and threading bugs. Unix isn't supposed to get along with Windows. Windows isn't known to get along with Unix very well, either, unless you isolate them from each other through a virtual machine. But WSL is doing its job acting as a go-between. I don't know the details of how it works, but I do know that you need to have some virtualization enabled, but it isn't a full virtual machine (Windows 10 Home edition doesn't support Hyper-V). In WSL, the Linux side can see the Windows disk as a mount point, but it doesn't seem that the Windows side can see the Linux disk. WSL gives you a bash shell in a window capable of running Emacs, or you can run an XWindows server like Xming under Windows if you want the full X experience. Performance seems reasonably snappy.



Well live blogging didn't work. It just felt rude to be typing and not paying full attention while someone was demonstrating some software that he had obviously worked very hard on. So I'll give a re-cap of what I understood from the demo. I'm no expert on CAD systems and most likely misunderstood important points, so take this as a novice's view. I asked David to help me correct this write-up. Not surprisingly, there's one important point I misunderstood, so I'll put in David's explanation.

Gendl is inspired by ICAD. Through use of CLOS and some macros, Gendl provides a DSL that allows you to design an object through a tree-like decomposition of its component pieces. Each component has properties, and these can be inherited by subcomponents (so, for example, when you paint a component a certain color, the color propagates down the component tree to the child components and they get painted, too).

(Here I messed up majorly.) In David's words
The technical term for the properties is “messages.” In many ways this is a message-passing object system, inspired by Flavors, which was inspired by SmallTalk (The ICAD System was built with Flavors, not CLOS).

Note there are two, orthogonal, concepts of "inheriting" going on. Normal class mixins provide one type of inheritance -- an object definition will inherit all the messages of its mixins. We usually call this "inheritance."

The passing of values from parent instance to child instance and other descendant instances in the instance tree is called “passing down” rather than inheritance, to avoid confusion with that other, different, inheritance. Messages can be passed down implicitly (by making them trickle-down-slots), or passed down explicitly by including them as keyword/value pairs in the object specification.

Another way to think of this is that the class (or “type”) of a given instance can inherit from multiple mixins, can contain multiple child object instances, but can have at most one Parent object in the instance tree (and it can have zero Parent objects if it's the root)

Also:

The mixin inheritance relationship is an “is a” relationship (essentially the same as normal CLOS inheritance).

The parent-child relationship is a “has a” relationship, and comes along with automatic demand-driven (i.e. lazy) dependency tracking.

The base component contains a co-ordinate system, so all components have a co-ordinate system relative to the root object. One primary module or application of Gendl is the web-based UI Geysr, that allows you to navigate the component tree, inspect components, change their properties, and visualize how they fit together. This module also provides an “exploded” view of the object if desired, where each subcomponent is rendered displaced a small distance from it's actual location. It can also render an exploded view of the assembly processs for the object.

The DSL provides a way of specifying a components location via constraint-like functional messages between components. This means that components can get resized when necessary, angles recomputed when necessary, and even the number of subassemblies recomputed dynamically when the object changes. In the “bench” example David showed me, the number of slats in the bench seat was recomputed dynamically from the seat width, the slat width, and constraints on the slat spacing. The user simply specified the perimeter of the seating area and Gendl figured out how many 2x4's you'd need to cover it. (Maybe I'm easily impressed, but I thought this was pretty neat.)

Another part of Gendl is the Process Planning module which computes a Manufacturing Bill of Processes and Raw Materials. This is much more interesting. Again using the DSL provided by CLOS, you define rules on how to build components from their constituent parts. Then Gendl, starting at the root node, infers a construction plan by recursive descent through the components and combining the rules. When it is done, you have the instructions for building the entire assembly. For the bench example, it started with purchase of sufficient 2x4s for the frame, seat, and back, then cutting the 2x4s to length, some at an angle for the reclining back, then fastening the cut pieces together, finally painting the result. The leaf nodes in the process planning tree represent the Bill of Materials for the object under construction. I thought this was pretty cool, too.

While I know nothing about CAD, I do know a little about computer languages, so I'll opine a little on the DSL. The basic operation in Gendl is define-object which is a macro that extends defclass. Objects have slots (fields) that hold values. One feature of Gendl is the ability to define “trickle down” slots whose values are available in the environment of subcomponents. Thus when you attach a new seat slat to the seat of the bench, it can use the trickle down value to match its paint color with that of the rest of the seat. This use of “environmental” properties isn't unique to Gendl, but it is worth note. David says, “Trickle-down slots are essentially a short-hand way of passing messages from a parent down to its descendants.

The recursive nature of Lisp matches well with the recursive decomposition of objects. The DSL makes it obvious to even the casual observer how the objects are composed. The value of the slots in the object can be defined as any Lisp expression, but drawing from the constraint-like language subset makes it obvious how the pieces relate to each other. You don't specify that the arm rests are 60 inches from the center (although you could), you specify that they butt up against the backrest. Given that constraint, you can move one subassembly and Gendl will automatically recompute the positions of the adjoining assemblies. While this is powerful, I suspect you need a fair amount of experience in setting up the constraints to do it so that it is useful.

Something I completely missed because it is natural to me and other Lisp hackers, but not to CAD designers, is the power of having an Emacs/Slime REPL in parallel with the web-based interface. The web-based interface gives you a natural point-and-click, drag-and-drop, UI with the instant visual feedback while at the same time you have the powerful DSL available in a REPL for experimenting directly with the object tree in a natural text-based way. (Sometimes people forget that text itself is highly abstract visual medium with thousands of years of development behind it.) In David's demo, he would move between the REPL and the UI extremely frequently, keeping both windows open at the same time, sometimes manipulating the object graphically, other times textually. In retrospect, this looks like a huge advantage, but at the time it seemed like an obvious way to use the system. I expect other jaded command-line hackers would have the same experience.

The rules for constructing components are written as CLOS methods that are specialized to the component they are constructing. It is rather obvious what rule applies for a given component. In addition it is obvious what to specialize on for constructing a component. The rule files are straightforward and reasonably terse.

Given the shortness of the demo (there was a lot to grasp), and my own inexperience, I don't think I could write an object description from scratch, but given an existing description I feel confident I could add a small subcomponent. I can see how this product would be useful in design and manufacturing and I think the learning curve doesn't look too steep if one is willing to start small before working up to something hugely complex.

Thanks to David for giving me the demo and for helping correct my blog post. I take credit for all the errors and misconceptions.