Sunday, May 26, 2024

Exception Handling for Control Flow

Back when I was taking a Software Engineering course we used a language called CLU. CLU was an early object-oriented language. A feature of CLU was that if you wrote your code correctly, the compiler could enforce completely opaque abstract data types. A good chunk of your grade depended on whether you were able to follow insructions and write your data types so that they were opaque.

CLU did not have polymorphism, but it did have discriminated type unions. You could fake simple polymorphism by using a discriminated type union as the representation of an opaque type. Methods on the opaque type would have to dispatch on the union subtype. Now to implement this correctly, you should write your methods to check the union subtype even if you “know” what it is. The course instructors looked specifically for this and would deduct from your grade if you didn't check.

One subproject in the course was a simple symbolic math system. You could put expressions in at the REPL, substitute values, and differentiate them. The core of the implementation was term data type that represented a node in an expression tree. A term was a type union of numeric constant, a symbolic variable, a unary expression, or a binary expression. Unary and binary expressions recursively contained subterms.

The term data type would therefore need four predicates to determine what kind of term it was. It would also need methods to extract the constant value if it was a constant, extract the variable name if it was symbolic, and extract the operator and subterms if it was a unary or binary expression. The way to use the data type was obvious: you'd have a four-way branch on the kind of term where each branch would immediately call the appropriate selectors. But if you wrote your selectors as you were supposed to, the first thing they should do is check that they are called on the appropriate union subtype. This is a redundant check because you just checked this in the four-way branch. So your code was constantly double checking the data. Furthermore, it has to handle the case should the check fail, even though the check obviously can never fail.

I realized that what I needed was a discrimination function that had four continuations, one for each subtype. The discrimination function would check the subtype and then call the appropriate continuation with the subcomponents of the term. This would eliminate the double checking. The code was still type safe because the discrimination function would only invoke the continuation for the correct subtype.

One way to introduce alternative continuations into the control flow is through exceptions. The discrimination function could raise one of four exceptions, each with the appropriate subcomponents as arguments. The caller would not expect a return value, but would have four catch handlers, one for each subtype. The caller would use the try/except syntax, but it would act like a switch statement.

The TA balked at this use of exceptions, but I appealed to the professor who saw what I was trying to do and approved.

There is some debate about whether exceptions should be used for control flow. Many people think that exceptions should only be used for “exceptional” situations and that it is poor form to use them for normal control flow. I think they are taking too narrow a view. Exceptions are a way to introduce alternative paths of control flow. One can use them to, for instance, handle an exceptional situation by jumping to a handler, but that's not the only way to use them.

Of course you should think hard about whether exceptions are the right way to introduce alternative control flow in your use case. Exception syntax is usually kind of klunky and you may need to rewrite the code to insert the exception handling at the right point. Readers of your code are likely to be surprised or baffled by the use of exceptions for control flow. There is often a significant performance penalty if an exception is thrown. But sometimes it is just the trick.

5 comments:

Josh Ballanco said...

Aside from the (suspect, in my opinion) semantic case of "exceptions are for exceptional situations" argument, I've often heard two different reasons given for avoiding the use of exceptions for control flow: exceptions-as-flow-control is just a form of "goto", and exceptions are to expensive/not optimized by the language. Those who make the former argument have completely missed the point of that famous essay.

The second, argument, bears some consideration, however. I first encountered this argument when using expressions-as-flow-control in Java 1.1.6. If memory serves, at that time in history the effort to construct an exception involved the allocation of surprising amounts of heap space. I also vaguely recall an effort made at some point in the intervening years to dramatically reduce those allocations. The point remains, though, that if a language implementor expects exceptions to be, well, exceptional, they are likely not going to spend as much effort to optimize them as they would, say, a loop construct or a method call.

This is one reason why I have always loved Ruby's `throw`/`catch` construct. At a glance, it seems no different than `begin`/`rescue`/`end` (which is Ruby's version of exception handling). Indeed, `throw`/`catch` uses terms much closer to those used in other languages exception constructs, but it is not exception handling. It is, rather, much closer to what I think we might all agree is the thing you really want:

`call-cc`

Joe Marshall said...

Function calls as control flow is just a form of 'goto' as well, do they object to that?

Josh Ballanco said...

Dijkstra may have been arrogant and misunderstood, but he was rarely wrong. Everyone else, on the other hand...

But yes, I've found in my career that "GOTO considered harmful" is a favorite hammer for programmers to pull out whenever they don't understand a particular nail.

Scott L. Burson said...

A place I used to work at had a large Java program, one module of which was doing a depth-first search over a graph. Under some circumstances (I forget the details) it would want to jump out some number of levels, which it did, of course, by throwing an exception. But it kept a pre-built instance of the exception lying around for the purpose, so it wouldn't have to allocate a new one every time. This worked and AFAIK was reasonably efficient.

Common Lisp catch/throw are pretty fast too, I think. Interestingly, though, I haven't seen them used much; it seems more common to pass down a function closed over a block name, and then use return-from with the block name to jump out. This "lexical throw" is presumably even faster because it doesn't have to scan the stack.

Joe's multiple-continuation CLU trick would be fairly natural in Scheme. I toyed with this idea briefly c.1988, when I was playing around with T, but never did much with it.

(This was 6.170, yes? I take it Liskov herself overrode the TA? Cool!)

Joe Marshall said...

It was 6.170.
Bill Weihl was the lecturer that term. I had Liskov as a recitation instructor for 6.033 and once had a discussion with her in re the shortcomings of CLU.

I always enjoyed the fact that the heavy hitters at the 'tute would be closely involved in teaching the underclassmen.