In some languages, like Lisp and OCaml, every language construct is an expression that returns a value. Other languages, like Java or Python, have two kinds of language constructs: expressions, which combine compositionally and which have return values, and statements, which combine sequentially and which have no return values and thus must operate by side effect. Having statements in your language needlessly makes things more complicated, but language designers seem to want to go much further and add complexity that just seems capricious.
You usually cannot use a statement in a context where an expression is expected because there is no return value. You can use an expression where a statement is expected by simply discarding the return value. This means there are two kinds of contexts. A sane designer would provide a way to switch between statement and expression contexts, but language designers typically omit one.
Language constructs, such as binding or iteration, must be provided as statements or expressions or both. Language designers seem to randomly decide which constructs are statements and which are expressions based on whims and ease of compiler implementation. If you need to use a construct, but aren’t in the right kind of context, you need to switch contexts, but without a way to do this, you may have to rewrite code. For example, if you have a subexpression that could raise an exception that you want to handle, you’ll have to rewrite the containing expression as a series of statements.
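A minimal sketch of that rewrite in Python, where try/except exists only as a statement. The function names and DEFAULT_PORT are hypothetical, chosen only for illustration.

```python
DEFAULT_PORT = 8080  # hypothetical fallback value, for illustration only


def describe_unsafe(raw: str) -> str:
    # One compact expression, but int(raw) may raise ValueError and
    # there is no expression-level construct to handle it in place.
    return "port=" + str(int(raw) + 1)


def describe_safe(raw: str) -> str:
    # To handle the exception, the subexpression is pulled out into
    # statements and its result is carried along in a variable.
    try:
        port = int(raw)
    except ValueError:
        port = DEFAULT_PORT
    return "port=" + str(port + 1)
```

The logic barely changed, yet the single expression became four statements and a temporary.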
A sane language designer would use the same syntax for the same construct in both expression and statement form, but typically the construct will have a very different syntax. Consider sequential statements in C. They are terminated by semicolons. But sequential expressions are separated by commas. Conditional statements use if/else, but conditional expressions use ?:. When refactoring, you cannot simply move code between contexts, you have to change the syntax as well.
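C is not alone in this. As a rough illustration, here is the same conditional written both ways in Python; the function names are made up for the example.

```python
def sign_statement(n: int) -> str:
    # Statement form: if/else with indented blocks; the result has to
    # travel through a variable assigned in each branch.
    if n < 0:
        label = "negative"
    else:
        label = "non-negative"
    return label


def sign_expression(n: int) -> str:
    # Expression form: same logic, but the value comes first and the
    # condition sits in the middle, so the code cannot simply be moved.
    return "negative" if n < 0 else "non-negative"
```

Both functions compute the same thing, but moving the conditional from one context to the other means reordering it, not just relocating it.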
Writing a function adds a new expression to the language. But function bodies aren’t expressions; they are statements. This automatically destroys referential transparency because you cannot substitute the function body (statements) at the call site (expression context).
try/catch is a nightmare. Usually you only get try/catch as a statement, not an expression, so you cannot use exception handling in a subexpression. You cannot get a value out of a try/catch without a side effect. Throw, on the other hand, throws a value, so it could throw to an expression context, except that catch is a statement.
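The usual manual workaround is to hide the statement inside a function, since a function call is an expression. A sketch in Python, assuming a hypothetical helper named attempt (it is not part of the standard library):

```python
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def attempt(thunk: Callable[[], T], default: T,
            exc_type: Type[BaseException] = Exception) -> T:
    """Hypothetical helper: call thunk, or return default if it raises exc_type."""
    try:
        return thunk()
    except exc_type:
        return default


# Exception handling can now sit inside a larger expression, at the
# cost of wrapping the risky subexpression in a lambda.
total = 1 + attempt(lambda: int("not a number"), 0, ValueError)
```

This works, but the programmer has to write the wrapper and the lambda by hand, which is exactly the kind of bookkeeping the language could have done.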
Having two kinds of language constructs and two different contexts in which you can use some of them but not others, and different syntaxes depending on the context just makes it that much more difficult to write programs. You have to keep track of which context is current and which syntax to use and constantly switch back and forth as you write a program. It would be easy for a compiler to introduce the necessary temporaries and rewrite the control flow so that you could use all language constructs in either context, but language designers don’t bother and leave it up to the programmer to do it manually.
And all this mess can be avoided by simply making everything be an expression that returns a value.