Monday, September 21, 2009

Closing in

In the previous posts I showed some problems that occur when you poke holes in the abstraction you get from procedures. By default, lexical scoping is opaque and you have to write extra code to poke through the holes. But is there any advantage to procedural abstraction and lexical scoping beyond the fact that it is a reasonable default?

The big advantage is that procedural abstraction allows you to separate use from implementation. We don't need to know how a procedure accomplishes what it does, we just need to know what the result should be. On the other side of the barrier, we don't need to know how where the arguments came from, or what the result is used for, we just need to compute it. Presumably, the more efficiently the better. Now let's return to Louis Reasoner. He's just written a sorting routine:
(define (lib-sort list <)
  (cond ((pair? list)
         (let ((first (car list)))
           (do ((before '() (cons (car after) before))
                (after (lib-sort (cdr list) <) (cdr after)))
               ((or (null? after) (< first (car after)))
                (append (reverse before)
                        (cons first after))))))
        ((null? list) '())
        (else (error "foo"))))
It occurs to him that maybe that call to reverse could be a bottleneck, so he instruments it with the code from the last post:
(define lib-sort
  (let ((reverse-counter 0))
    (register-counter! reverse-counter)
    (lambda (list <)
      (cond ((pair? list)
             (let ((first (car list)))
               (do ((before '() (cons (car after) before))
                    (after (lib-sort (cdr list) <) (cdr after)))
                   ((or (null? after) (< first (car after)))
                    (set! reverse-counter (+ reverse-counter 1))
                    (append (reverse before)
                            (cons first after))))))
            ((null? list) '())
            (else (error "foo"))))))
;Value: lib-sort

(lib-sort '(3 1 4 1 5 9 2 6 5 3 5) <)
;Value 20: (1 1 2 3 3 4 5 5 5 6 9)

((cadr (assq 'reverse-counter *counters*)))
;Value: 11
But Louis is called away before he can go much further down this path. He gives the rest of his tasks to his intern.

The intern has to write a program that, given a list, sorts it and returns a pair where the car is the sorted list, and the cdr is the length of the sorted list. That's trivial:
(define (sort-and-length list <)
  (let ((s (lib-sort list <)))
    (cons s (length s))))
But it occurs to him that this is less efficient than it could be. The call to length has to traverse the entire list, and presumably the call to lib-sort must as well. In order to cut down on the number of list traversals, the intern takes a look at the code for lib-sort. It is rather baffling to him (he is an intern), but he figures out that since reverse is called on every recursive call, the number of calls to reverse has to equal the length of the list. So he codes up this monstrosity:
(define (sort-and-length list <)
  (let* ((c (assq 'reverse-counter *counters*))
         (start ((cadr c)))
         (s (lib-sort list <)))
      (cons s (- ((cadr c)) start))))
Time passes...

It turns out a customer is complaining that the code is too slow. A quick test shows that he is trying to sort a list of ten thousand elements and it is spending all its time in lib-sort.

“What idiot wrote this?!” asks Cy D. Fect. “There is an FFI to qsort.” Cy replaces the sort routine:
(define (lib-sort list predicate)
  (vector->list (ffi-qsort (list->vector list) predicate)))
Of course he removed the code that tracks calls to reverse because qsort doesn't use it. When he checks in the code, lib-sort is much, much faster, but for some reason all the menus in the GUI now only contain a single entry. Cy calls in Ben Bitdiddle for help. Ben notices that the GUI calls sort-and-length for each menu, and sort-and-length is reporting that each menu has zero entries. He fixes sort-and-length to do the obvious thing and everything is off and running again. Ben shakes his head and sighs.

One of the most important points of procedural abstraction is that it allows you to change the implementation of the procedure at will without having to analyze the entire code base. We saw before that if we allow the internal variable names to escape (by using them as keys), then we can no longer change the names. In this case, we're going further. We want to eliminate the name altogether because we're changing the entire algorithm. The variable `reverse-counter' won't even exist anymore in this code. By exposing it in this way, we made it possible for an unplanned dependency to be added.

In this example, the unplanned dependency was rather idiotic. That's not the point. I have run into this sort of bug many, many times where an abstraction is not fully documented (or not well documented) and a programmer misunderstands the API and uses some internal function for the wrong purpose. Things work fine until the implementation changes, then very weird unrelated things start to break. Sometimes the code is so tangled that you have to emulate the effect of the old implementation just to keep from having to rewrite huge swaths of the unrelated code.

3 comments:

mquander said...

Thank you very much for this interesting series of posts.

Pascal Costanza said...

Hm, what exactly are you saying?

Imagine Louis Reasoner hadn't added the reverse-counter to lib-sort. However, the Scheme dialect in use is very modern and uses CLOS-style generic functions heavily. Now, the intern realizes that reverse is actually also a generic function. So he comes up with this 'solution':

(define get-reverse-count #f)

(define-method reverse :after
  (let ((reverse-count 0))
    (set! get-reverse-count
            (lambda () reverse-count))
    (lambda (list)
      (set! reverse-count (+ reverse-count 1)))))

(define (sort-and-length list <)
  (let* ((start (reverse-count))
           (s (lib-sort list <)))
    (cons s (- (reverse-count) start))))

Are you saying that generic functions are also bad?

Joe Marshall said...

I'm assuming you mean to call `get-reverse-count' in the sort-and-length function.

I see what you're saying here.

I think I'll argue that my original example had two serious problems. The first is that it exposed the internal name of the counter variable. The second is that someone used this as a communication backchannel that created a dependency. Your example demonstrates the latter problem via a different mechanism, but I'll note that it actually avoids the former one.


On the other hand, I think you'll agree that no matter how the backchannel dependency is constructed, it leads to extremely brittle and hard to understand code.


I think I'll have to think about this some more and post a follow-up. I'm a real fan of generic functions, though. On the other hand, I think `with-slots' is pretty nasty.