van Rossum suggests that tail recursion may be difficult for the compiler if function redefinition is permitted. (Of course we both agree that doing this is not good style. Furthermore, we both agree that it nevertheless ought to ‘work’ in an intuitive way.) He gives this example:
def f(x):
    if x > 0:
        return f(x - 1)
    return 0

g = f

def f(x):
    return x

g(5)

and explains:

The call to g(5) invokes the earlier f, but the “recursive” call no longer recurses! At run-time, the name ‘f’ is rebound to the later non-recursive definition, so the returned value is 4, not 0.

The direct translation into Scheme is this:
(define f
  (lambda (x)
    (if (> x 0)
        (f (- x 1))
        0)))

;; Prove it is tail recursive.
(f 1000000) => 0

;; Mutate the definition of F.
(define g f)
(set! f (lambda (x) x))

;; Now what happens?
(g 5) => 4

This will work in any version of Scheme. The compiler doesn't need to analyze the target of the function call; it only needs to notice that the caller's stack frame is no longer in use.
What about debugging?
The most common objection to tail recursion is that it makes debugging more difficult because the stack backtrace is missing the frames from the tail call positions. This is a real, pragmatic concern that I think is worth addressing.
MIT-Scheme has an approach that seems promising. At each evaluation step the interpreter records the program state in a structure called the ‘history’. The history essentially contains a copy of the stack frame. (Wait!!! I can hear the objections already and I haven't even posted this!) Obviously, this would have the exact same effect on the space utilization as the stack would in a non-tail-recursive implementation except for one very important thing: the history is a ring buffer with a limited size. When a tail recursive call is recorded in the history, each frame in the history moves down by one. If a sufficient number of calls are made, the oldest record is deleted.
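The essence of the bounded history can be sketched in a few lines of Python (an illustration of the idea, not MIT-Scheme's actual code; the names here are my own). A ring buffer with a fixed maximum length records each tail call, silently discarding the oldest frame once the buffer is full:

```python
from collections import deque

HISTORY_DEPTH = 10                      # hypothetical user-controllable limit
history = deque(maxlen=HISTORY_DEPTH)   # ring buffer: oldest entry falls off

def record_tail_call(frame):
    """Record a snapshot of the frame; space stays bounded by HISTORY_DEPTH."""
    history.append(frame)

# A loop expressed as a million tail calls...
for n in range(1_000_000):
    record_tail_call(("loop", n))

# ...yet the history retains only the most recent HISTORY_DEPTH frames.
print(len(history))        # 10
print(list(history)[0])    # ('loop', 999990)
```

A million recorded tail calls consume no more space than ten, which is the whole point: the history gives you recent frames for debugging without reintroducing unbounded stack growth.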
Recording the stack frame in the history does change the space requirements, but in a far more benign way than simply foregoing tail recursion altogether. Since the oldest frames are dropped from the history, the total storage it consumes is bounded by a constant (the size of the ring buffer), so the asymptotic space complexity class is unchanged. (Actually, I haven't seen a formal proof of this, so there may be some corner cases. Bear with me for a moment.)
In MIT-Scheme, the history is implemented as a ring buffer of ring buffers. Imagine that you have discovered the rib cage of an animal that was shaped like an inner tube. When a non-tail recursive call is made, we rotate the entire inner tube to get to the next rib. When the non-tail call returns, we rotate it back. When a tail call is made, we rotate the current rib and record the stack frame. The effect is this: if we have, say, 5 ribs, then the top five stack frames will have a backtrace of the tail calls that occurred at that level. If each rib is, say, 10 elements long, then each of the top five stack frames will show up to 10 tail calls.
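Here is a toy model of that two-level structure in Python, to make the rotations concrete. This is my own sketch, not MIT-Scheme's implementation; the class and method names are invented, and clearing a rib on entry to a subproblem is a simplification of what rotation-with-overwrite actually does:

```python
from collections import deque

class History:
    """Toy model of a two-level history: a ring of ribs, each rib a ring of frames."""

    def __init__(self, ribs=5, rib_size=10):
        # Backbone: a fixed ring of ribs. Each rib is itself a bounded ring.
        self.ribs = [deque(maxlen=rib_size) for _ in range(ribs)]
        self.level = 0  # index of the current rib

    def non_tail_call(self):
        # Rotate the whole 'inner tube' to the next rib for the subproblem.
        self.level = (self.level + 1) % len(self.ribs)
        self.ribs[self.level].clear()  # simplification: start this level fresh

    def non_tail_return(self):
        # Rotate back when the subproblem returns.
        self.level = (self.level - 1) % len(self.ribs)

    def tail_call(self, frame):
        # Rotate the current rib: record the frame, dropping the oldest.
        self.ribs[self.level].append(frame)

# Two tail calls at the outer level, then a subproblem with many tail calls:
h = History(ribs=5, rib_size=10)
h.tail_call("f1")
h.tail_call("f2")
h.non_tail_call()
for i in range(25):
    h.tail_call(f"g{i}")
# The current rib shows only the 10 most recent tail calls at this level...
print(list(h.ribs[h.level]))
h.non_tail_return()
# ...and the outer level's rib is undisturbed.
print(list(h.ribs[h.level]))    # ['f1', 'f2']
```

The key property is visible in the output: each level of non-tail recursion keeps its own bounded record of the tail calls made at that level, independent of what happened in the subproblems below it.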
This sounds much more complicated than it really is. The end result is that when an error lands you in the debugger, you get most of the stack trace you would expect.
But what if that's not enough?
The size of the history can be controlled by the user. If you need a bigger stack trace, just make the history deeper or wider.
But what about the overhead and the extra complexity?
As it turns out, the description sounds worse than the reality. If you implement the ‘backbone’ of the history as a doubly-linked list, then rotating the history in either direction simply involves dereferencing a pointer. And if you keep each ‘rib’ of the history as a doubly-linked list as well, then recording a tail call is likewise just a pointer dereference.
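The pointer-chasing claim is easy to see with a circular doubly-linked list. This is again a sketch under my own names, not the actual MIT-Scheme code: ‘rotating’ in either direction is just following a next or prev pointer, and recording a frame overwrites the slot that was written a full revolution ago:

```python
class Node:
    """One slot in a circular, doubly-linked ring."""
    def __init__(self, value=None):
        self.value = value
        self.next = self   # a lone node points at itself in both directions
        self.prev = self

def make_ring(size):
    """Build a circular doubly-linked ring of `size` empty slots."""
    head = Node()
    for _ in range(size - 1):
        node = Node()
        # Splice the new node in just before head.
        node.prev, node.next = head.prev, head
        head.prev.next = node
        head.prev = node
    return head

def record(current, frame):
    """Rotate the rib one slot and record the frame, dropping the oldest."""
    current = current.next       # rotation is one pointer dereference
    current.value = frame        # overwrites whatever was here `size` calls ago
    return current

# Seven tail calls recorded into a three-slot rib:
cur = make_ring(3)
for i in range(7):
    cur = record(cur, i)
# Walking backwards from `cur` yields the three most recent frames: 6, 5, 4.
print(cur.value, cur.prev.value, cur.prev.prev.value)
```

No allocation happens per call and nothing is copied around: each tail call costs a pointer dereference and a store, which is why the overhead in practice is negligible.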
I think a simple demonstration might be in order here. This is a buggy version of the Ackermann function:
(define (ack m n)
  (cond ((= m 0) (/ 1 0))  ; deliberate error
        ((= n 0) (ack (- m 1) 1))
        (else (ack (- m 1) (ack m (- n 1))))))

And here is what happens when we try to run it:
(ack 2 2)

Division by zero signalled by /.

S0  (ack m (- n 1))
R0  (ack (- m 1) (ack m (- n 1)))
R1  (if (= n 0) (ack (- m 1) 1) (ack (- m 1) (ack m (- n 1))))
R2  (cond ((= m 0) (/ 1 0)) ((= n 0) (ack (- m 1) 1)) (else (ack (- m 1 ...
R3  (ack (- m 1) 1)
R4  (if (= n 0) (ack (- m 1) 1) (ack (- m 1) (ack m (- n 1))))
S1  (ack m (- n 1))
R0  (ack (- m 1) (ack m (- n 1)))
R1  (if (= n 0) (ack (- m 1) 1) (ack (- m 1) (ack m (- n 1))))
R2  (cond ((= m 0) (/ 1 0)) ((= n 0) (ack (- m 1) 1)) (else (ack (- m 1 ...
R3  (ack m (- n 1))
S2  (ack m (- n 1))
R0  (ack (- m 1) (ack m (- n 1)))
R1  (if (= n 0) (ack (- m 1) 1) (ack (- m 1) (ack m (- n 1))))
R2  (cond ((= m 0) (/ 1 0)) ((= n 0) (ack (- m 1) 1)) (else (ack (- m 1 ...
R3  (ack 2 2)

The frames labeled S0, S1, and S2 are ‘subproblem’ frames that result from the non-tail-recursive calls. The frames labeled R0, R1, etc. are the ‘reduction’ frames that result from the tail calls. In this case, the history contains everything that a stack trace would contain. Again, if this is not sufficient for debugging, both the depth and width of the history are under user control.
Keeping a history structure may or may not be the best way to deal with the loss of debugging information that comes from having proper tail recursion. This feature has been in MIT Scheme since the early 80s (possibly the late 70s). It isn't very difficult to implement. My description of it is probably larger than the code itself.
I think this is a reasonable counter to the problem of loss of debugging information.