Wednesday, January 1, 2025

Inlinig Primitive Function Calls and Argument Evaluation

Inlining some primitives

Reconsider our model of a Lisp program as a “black box” that issues a series primitive function calls. We can eliminate some of the primitive function calls by implementing them directly within our “black box”. We inline some primitives.

Take null? as an example. Instead of constructing (null? arg) as

[primitive-funcall1
  [quote [primitive null?]]
  [argument 0]]

we add a new S-code construct, primitive-null?, and construct (null? arg) as

[primitive-null?
  [argument 0]]

We don't have to evaluate the function. We don't even have to jump to the entry point of the null? primitive. After we evaluate argument 0, we just test for null in the interpreter and return T or NIL as appropriate.

There are maybe 20 or so primitives that are used frequently enough to be worth inlining in this way. Each primitive you inline like this adds bloat to the interpreter.

Inlining simple arguments

The leaves of a tree of S-code are the atomic expressions, whereas the internal nodes of the tree are compound expressions. We can eliminate the leaves by inlining them into their parent nodes. For example if a leaf node is a lexical variable reference, we inline this into the parent node. We unroll the evaluation of the leaf node thus saving a recursive call to the interpreter and an evaluation dispatch.

Consider our previous example which we consructed as

[primitive-null?
  [argument 0]]

We further specialize primitive-null? based on its argument type into primitive-null?-argument or primitive-null?-lexical. Now our S-code becomes:

[primitive-null?-argument 0]

The leaf node [argument 0] is absorbed into the parent node [primitive-null? ...] making a new leaf node [primitive-null?-argument]. The evaluator for this S-code node simply tests if argument 0 is null and returns T or NIL as appropriate.

Compare this to the original S-code:

[funcall
  [global 'null?]
  [argument 0]]

This required two recursive calls to the interpreter, a global lookup, and a primitive function call on top of the null test. We’ve eliminated all of those. There’s not much left to do. Testing null? in the interpreter is almost as fast as testing null? in compiled code.

The number of S-code types needed to perform this inlining is the number of kinds of leaf nodes raised to the power of the number of leaves in the parent node. A call to a one-argument primitive would need specializations for the cases where the argument is a quoted value, an argument, a lexical variable or a global variable — four specializations. Calls to a two-argument primitive turn into one of sixteen specializations — the product of four for each argument. A call to a three-argument primitive would turn into one of sixty-four specializations.

We can inline all the non-compound argument evaluations, both to primitive functions and to user-defined functions. In our S-code tree, we have removed all the leaf nodes and absorbed them into their parent nodes (which have now become new leaves). The interpreter is now quite a bit faster, although still not as fast as compiled code.

No comments: