Thursday, September 17, 2009

Yet more rambling

I hope I'm preaching to the choir when I say that procedural abstraction is the greatest thing since sliced bread. If you're still a skeptic, let me point you at these: Now of course there are a huge number of other worthy abstractions, but since procedural abstraction is universal, you can model them with procedures. (Yes, there are other universal models you could use. No, I'm not suggesting we discard the other abstractions and implement them with procedures.) I'm simply pointing out that procedures are a very nice, powerful, and universal abstraction.

Let's break them.

Why would I want to break abstraction? Isn't that a bad thing? Well, yes, it is. I think it is a complete disaster, and I hope you do, too. But I've noticed a few people that seem to think that maybe a little bending — maybe a tiny bit of tweaking — might be ok if you were able to get something for it at the end of the day. They are wrong.

One reason that procedures make for good abstraction barriers is that they are opaque. As a caller, you cannot see how the procedure is written or how it performs its work. You get to supply the arguments, it gets to return an answer and that's it. The barrier works the other way, too. The procedure you call cannot get its hands on the guts of the caller, either. This wasn't always the case. Some early Lisp dialects were dynamically scoped (but Lisp 1.5 had static binding!). The bindings of the caller were visible in the callee. There were several people that pointed out that this was not a good thing. Suppose you have this code:
;; In a library somewhere:

(defun lib-mapcar (f list)
  (if (null list)
      (cons (funcall f (car list))
            (lib-mapcar f (cdr list)))))
But later we get a bug report:
(lib-mapcar #'(lambda (x) (+ x 1)) (cons 1 2))
;; This crashes!
Obviously you shoudn't call lib-mapcar on an improper list, but crashing is less desirable than an informative error message, so Louis Reasoner is tasked with fixing lib-mapcar. It takes him a while, but he comes up with this:
(defun lib-mapcar (f list)
  (cond ((consp list)
         (cons (funcall f (car list)) 
               (lib-mapcar f (cdr list))))
        ((null list) '())
        (t (error "improper list"))))
In the code review, Alyssa P. Hacker suggests logging the calls to f, so Louis does this:
(defun lib-mapcar (f list)
  (cond ((consp list)
         (let ((h (car list)))
           (log "Calling " f " on " h)
           (cons (funcall f h) 
                 (lib-mapcar f (cdr list)))))
        ((null list) '())
        (t (error "improper list"))))
The code passes the regression tests and they ship it.

Two days later, a frantic call comes in from Larry Liverless Labs. Their physics package has mysteriously stopped working. They describe the bug:
(let ((h 6)  ;; approx. Planck's constant
      (c 3)) ;; approx. C
     #'(lambda (lam) (/ (* h c) lam))
    '(3 1 4 1 6)))
;; Expected answer:  (6 18 4 18 3)
=> (3 9 2 9 1)
The problem, obviously, is that the variable h that Louis introduced is shadowing the variable h in the physics package.

“I'll just change it, then. Can I get a list of all the variables used by all our customers so I know which ones to avoid? Oh, and we should document all the variables we declare so that future customers won't accidentally use one.”

The solution is to use lexical scoping because it hides the details of the implementation.

Now I'm not saying that dynamic scoping is bad or wrong, I'm saying that it is the wrong default. Suppose there is a dynamic variable that controls the logging level:
         (let ((h (car list)))
           (log logging-level "Calling " f " on " h)
If there are only a handful of these, we could document them all. Or we could invent a naming convention that makes it obvious that we expect it to be dynamic (add asterisks around it, for example), or we could add special forms for establishing and referring to dynamic variables (like fluid-let or parameter-value). But we really want it to be the case that when we send a junior programmer like Louis Reasoner in to look at the code, he can assume that local changes have local effect and that as long as he computes the correct value, he'll be ok.

to be continued...


Unknown said...

Reading the endless debates on hygienic macros and thinking about the matter I came to the conclusion (it was indeed more a feeling than a conclusion) that hygiene is the right thing in a lexically scoped language. Anything else is inconsistent.

Your posting gave a name to this discomfort, namely "abstraction breach". Thank you very much.

In a similar vain, I also have the "feeling" that having two namespaces does not match with first class functions, but I confess that I have not thought much about this another discomfort.

BTW, in some of your examples above there calls to `my-mapcar'. I guess you meant `lib-mapcar'.

Joe Marshall said...

Hygiene is definitely the right thing, but I think the jury is still out on the best way to achieve it. All the proposed mechanisms — syntax-rules, syntax-case, explicit renaming, syntactic closures, reverse syntactic closures, etc. — seem to be harder to use than they ought to be.

Thanks for pointing out the `my-mapcar' bugs. I fixed them. I had tested the code in Emacs lisp, but I decided to change the names when I copied it to the blog and built a little story around it.