## Friday, October 9, 2009

### To make a long story short

I was mulling over a bunch of ideas about benchmarking, experimentation, and the philosophy of science. You'll be happy to hear that I'll spare you the details. The end result is this: The sboyer benchmark when given a argument of 1 (591777 rewrites) runs with a median time of 6.596 seconds on jrmscheme. No optimizations are turned on, but the code has been run through SF with `(declare (usual-integrations))`. This is pretty lackluster, but there is no place to go but up!

So let's start with the optimizations. I have a lot of code that instruments the interpreter when I run in debug mode. According to the instrumentation, from the initial REPL prompt, through loading and running the benchmark, printing the results and halting at the next prompt, there are 102,320,483 evaluations. The breakdown is this:
```Variable                39028733
PrimitiveCombination1   20796045
Conditional             13248184
PrimitiveCombination2    6294095
Quotation                6148442
Combination2             4155507
Combination1             3576004
Sequence2                2572345
StandardLambda           2313932
Assignment               1666782
Combination              1420518
Disjunction              1087864
PrimitiveCombination3       6873
Sequence3                   4451
Access                       388
PrimitiveCombination0        295
Definition                    22
Comment                        2
StandardExtendedLambda         1
```
Variable lookup is the number one item, and optimizing it will help tremendously.

If we keep track of the lexical depth of the variables (that is, how many environment frames we had to search), we find this distribution:
The vast majority of the variables are found in the topmost frame.

Recall the code for variable lookup:
```object DeepSearch (Symbol name)
{
int offset = this.envClosure.FormalOffset (name);
if (offset == -1) {
if (this.incrementalDefinitions.ContainsKey(name))
return this.incrementalDefinitions[name];
else
return this.envClosure.Environment.DeepSearch (name);
}
return this.bindings [offset];
}
```
Most of the time, the `offset` is a non-negative index into the binding array. But the offset is determined by the position of the name in the lambda expression that binds the variable, and that can be determined before we evaluate the code. So we'll introduce a new SCode type `Argument` that is the same as `Variable` except that it also contains the offset. When we construct a `lambda` expression, we'll walk the scode body and replace references to the lambda parameters with `Argument` structures. We only replace the references if they occur at the same binding level. If there is an intermediate lambda expression, we cannot replace the references. We can then add an `ArgumentValue` method to our environment objects:
```object ArgumentValue (int offset)
{
return this.bindings[offset];
}
```
The `EvalStep` of a `Variable` looks like this:
```public override bool EvalStep (out object answer, ref Control expression, ref Environment environment)
{
return false;
}
```
The `EvalStep` for an `Argument` will be like this:
```public override bool EvalStep (out object answer, ref Control expression, ref Environment environment)
{
return false;
}
```
There is one more change. If the variable is not an `Argument` then we know one thing for sure — we won't find it in the bindings of the topmost environment. It may be added later as an incremental, but it certainly won't be in the formal parameters and it is pointless to search there. We'll save time by avoiding that search:
```object NonArgumentValue (Symbol name)
{
// Skip the formals of this frame.
if (this.incrementalDefinitions.ContainsKey(name))
return this.incrementalDefinitions[name];
else
return this.envClosure.Environment.DeepSearch (name);
}

object DeepSearch (Symbol name)
{
int offset = this.envClosure.FormalOffset (name);
if (offset == -1) {
if (this.incrementalDefinitions.ContainsKey(name))
return this.incrementalDefinitions[name];
else
return this.envClosure.Environment.DeepSearch (name);
}
return this.bindings [offset];
}
```
Basically we unrolled the loop by one and tossed out the dead code.

These two tiny changes have a big effect on the running time. Our median time is now 5.488 seconds, or 0.832 times the original time.

We're by no means done with this sort of optimization, but I'll get to that soon...