Tuesday, February 11, 2020

Macro pitfalls

Macros are a unique source of power in Common Lisp, but there are some pitfalls to watch out for.

A compiler macro is special macro that is expanded only by the compiler. The interpreter doesn't expand the macro and simply evaluates the form like a normal function call. If you aren't careful when writing a compiler macro, the interpreted and compiled forms may not evaluate the same and that's probably not what you want. Here we abuse this effect
(defun foo (x) 'interpreted)

(define-compiler-macro foo (x) ''compiled)

CL-USER> (foo)
INTERPRETED

CL-USER> ((lambda () (foo)))
COMPILED
That might be unexpected. It appears that in this implementation (SBCL) the compiler is called on lambda expressions when they are evaluated.

Like all macros, a compiler macro is given the unevaluated source code of the arguments, not the value. We can see that in this example
(defun foo (l r)
  (format t "~%Foo")
  (list r l))

(define-compile-macro foo (l r) 
  `(progn 
     (format t "~%Foo")
     (list ,r ,l)))

CL-USER> (foo (progn (format t "~%First") 'l) (progn (format t "~%Second") 'r))

First
Second
Foo
(r l)

CL-USER> ((lambda () (foo (progn (format t "~%First") 'l) (progn (format t "~%Second") 'r))))

Foo
Second
First
(r l)
When interpreted, the arguments are evaluated left to right before the function is entered. When compiled, the arguments end up being evaluated right to left and after the function is entered.

Unless you really want this — and shame on you if you do — you have to be careful when writing your macro to preserve the left-to-right, call-by-value semantics that are probably expected. The easiest way to do this is to write the expansion so that it just substitutes the body of the function. Something like this
(define-compiler-macro foo (l r)
  `((lambda (l r)
      (format t "~%Foo")
      (list r l))
    ,l
    ,r))

CL-USER> (foo (progn (format t "~%First") 'l) (progn (format t "~%Second") 'r))

First
Second
Foo
(r l)
Or you could use a let expression with the same effect
(define-compiler-macro foo (l r)
  `(let ((l ,l)
         (r ,r))
     (format t "~%Foo")
     (list r l)))
The version with the lambda expression doesn't even require putting a block of let bindings at the front. You just plop down the original argument list and body after the lambda, but both forms are equivalent.

The problem with doing this is that you have probably disabled the ability of the compiler to optimize the expression. You are forcing the compiler to ensure that the arguments are evaluated in left-to-right order before the body. A Sufficiently Smart compiler might be able to provide some optimizations anyway. If your compiler is not Sufficiently Smart, you can take matters in to your own hands and substitute the arguments at the point they are used. Just be aware that you might be surprising people by changing the semantics at the call site.

Funny semantics isn't just a problem with compiler macros. Regular macros have to be written with care as well or you may surprise users when they write code they think are normal function calls. Compiler macros just have the unique property that they can change the semantics between interpreted and compiled code.

You can see a related effect when using symbol macros. A symbol macro substitutes a piece of code that computes a value. If we write
CL-USER> (let ((l (progn (format t "~%First") 'l))
               (r (progn (format t "~%Second") 'r)))
           (format t "~%Let body")
           (list r l))

First
Second
Let body
(r l)
we get the standard left-to-right, call-by-value evaluation. But we can mimic normal-order reduction by substituting the code for l and r before evaluating the body of the let by use of symbol-macrolet*
CL-USER> (symbol-macrolet ((l (progn (format t "~%First") 'l))
                           (r (progn (format t "~%Second") 'r)))
           (format t "~%Symbol-macrolet body")
           (list r l))

Symbol-macrolet body
Second
First
(r l)
If one of the arguments to a macro is a block of code, for instance the &body argument, then you probably want to avoid accidental variable capture.
(defmacro capturing-macro (&body body)
  `(let ((temp 'captured))
     (format t "~%Macro body binds temp to ~S" temp)
     ,@body))

(let ((temp 'lexical))
  (capturing-macro
     (format t "~%Temp is ~s" temp)))

Macro body binds temp to CAPTURED
Temp is CAPTURED
NIL
The lexical binding of temp is shadowed by the binding introduced by capturing-macro. This is probably unintended (except in the case of anamorphic macros, where capture is intended). Instead, you can ensure lexical scoping is maintained by closing over the body before introducing any new bindings
(defmacro non-capturing-macro (&body body)
  `(let ((temp 'captured)
         (body (lambda () ,@body)))
     (format t "~%Macro body binds temp to ~S" temp)
     (funcall body)))

(let ((temp 'lexical))
  (non-capturing-macro
    (format t "~%Temp is ~s" temp)))

Macro body binds temp to CAPTURED
Temp is LEXICAL
NIL
In this case, even a fairly naive compiler ought to be able to inline the call to body because it is simply a lexically apparent code block.

Inadvertent capture can happen in other direction as well if the macro caller shadows a binding used by the macro.
(flet ((funcall (x) (format t "~%Unexpected")))
  (let ((temp 'lexical))
    (non-capturing-macro
      (list temp))))

Macro body binds temp to CAPTURED
Unexpected
NIL
Here the caller shadowed funcall and the code the macro introduced ends up inadvertently calling it. This doesn't happen often in practice because people rarely shadow the top-level functions a macro depends upon, and that is good because there isn't an easy way to solve this reverse capture problem (other than don't do that).

The “hygienic” macro system in Scheme solves both kinds of accidental capture by appropriately renaming variables. There is a price, however. You either have to forego direct code manipulation and use a special pattern matching language, or write code that explicitly keeps track of the environment where the variables are bound. For simple macros, the pattern matching language is adequate, but for more complex macros, neither option is appealing.

*Macrolet rhymes with Chevrolet, naturally.

4 comments:

John Cowan said...

In SBCL there is no interpreter in the classic sense: EVAL basically invokes the compiler.

Joe Marshall said...

Apparently. Except top-level compiler-macros are not expanded:
(eval '(foo))
INTERPRETED

(eval '((lambda () (foo))))
COMPILED

Anonymous said...

http://www.sbcl.org/manual/#Interpreter

Rommel MARTINEZ said...

Thanks for the article. I found some minor typos:

1. There should be no required arguments in the definitions of FOO:

(defun foo (x) 'interpreted)
(define-compiler-macro foo (x) ''compiled)

should be

(defun foo () 'interpreted)
(define-compiler-macro foo () ''compiled)


2. The name of the macro is incorrect:

(define-compile-macro foo (l r) `(progn (format t "~%Foo") (list ,r ,l)))

should be

(define-compiler-macro foo (l r) `(progn (format t "~%Foo") (list ,r ,l)))


3. The symbols returned should be uppercase due to the unchanged readtable case. The same applies with the other values:

Foo
Second
First
(r l)

should be

Foo
Second
First
(R L)