Saturday, September 26, 2009

If you haven't heard

There seem to be more readers of my posts than there used to be, so I wanted to mention a project I've been slowly working on.

I've been porting MIT Scheme to the .NET CLR. MIT Scheme has two major components: the ‘microcode’ which provides the memory model and the primitives, and the ‘runtime’ which is the Scheme code and libraries that come with the system. Once upon a time, the ‘microcode’ really was microcode. The Scheme-81 Chip (sorry, I can't find the text on line) was a microcoded VLSI processor that interpreted the SCode representation of Scheme. Sometime around 1982, the microcode was translated to 68000 assembly code and used to run Scheme on the HP 9836 Chipmunk. Sometime around 1983 or so, Jim Miller wrote an SCode interpreter in C that replaced the microcode layer. My version is a new SCode interpreter in C#. The source code is available at http://code.google.com/p/jrm-code-project/.

I'm doing this for fun and to try out different ideas I've had about interpreter implementation. The interpreter basically works, but it is missing a lot of primitive procedures and I haven't hooked up the error handler yet. It is, of course, a lot slower than the standard version of MIT Scheme (which is mostly compiled), but it is in the ballpark of MIT Scheme on interpreted code.

So why am I bringing this up? Lately I've been posting about first-class environments. MIT Scheme has first-class environments, and it uses them to provide different ‘packages’ for the Scheme code (a package is separate namespace in which you can run REPL). MIT Scheme also has a compiler that can produce very efficient code in the right circumstances. There is a caveat, though. If you make use of first-class environments, the compiler is forced to create environment structures that are compatible with the interpreter's representation. This is because there may be a need to call in to the interpreter to eval an arbitrary expression in such an environment. In these circumstances, the compiler can perform essentially minor optimizations that do very little to speed up the code. MIT-Scheme only uses first-class environments — actually, I'm going to call them ‘interpreter compatible environments’ because that's more accurate — to construct the runtime packages. There are on the order of 50 of them. The rest of the code doesn't use them at all.

As I have mentioned in earlier posts, the two things that an interpreter does most is variable lookup and continuation management. These things are critical to interpreter performance. I managed to get two academic papers published on how to get the underlying system to manage continuations efficiently, yet still be able to reify them to support first-class continuations. Right now I'm working on improving variable lookup.

When an interpreter compatible environment is used, there is little choice but to represent the environment as a linked list of frames. Each frame can use a vector representation for the variables that are bound by lambda application, but each frame also needs the ability to incrementally add bindings should the user evaluate a ‘define’ expression or invoke load from one of these environments. When evaluating a variable, the interpreter walks the environment chain and scans each frame until it finds the closest binding with the correct name (this is called a ‘deep search’). Needless to say this can cause serious performance degradation. A number of ways have been devised to avoid this problem. One that is of interest is the idea of computing a fixed lexical address for a variable. In most circumstances, the environment chain always has the same shape and the variable always appears at a fixed location. It is faster to dereference the chain until the correct frame is found and then pull the value of the variable from the vector. Unfortunately, it can be incorrect to do so. If an intervening frame is interpreter compatible, the user might load a file or define a variable that shadows the lexical binding. If this happens, the shadowing variable should be used rather than the former one at the fixed location. In Free variables and first-class environments, Miller (see above) and Rozas describe how MIT Scheme deals with shadowed variables. In essence, when you introduce a shadowing binding, you walk the rest of the environment chain and mark the shadowed variables. The interpreter uses the fast lookup method, but checks for the mark. If the mark is absent, the correct variable was found. If the mark is present, the interpreter does a deep search to find the variable. It sounds a lot simpler than it is. Because the lexical environment structure forms a tree, it is possible to have variables that are shadowed along one branch, but unshadowed on the other. It is possible to have variables that are shadowed at different depths on different paths. These are not often encountered (in the mid 80's, long after they thought they had covered every possible case of shadowing, Rozas discovered a new way to fool the variable lookup mechanism into fetching the wrong variable). If it did happen, the error message “Broken Compiled Variable -- get a wizard” would appear and Scheme would halt.

In my attempted port of Object Lisp, I would get this all the time. The idea was to inject the object bindings into the lexical environment when invoking an object method. This, of course, caused shadowing. But upon leaving the object method, I would remove the shadowing bindings. Upon invoking a method on a different object, a different set of bindings would be injected. The variable lookup mechanism would get confused because it would find the shadowing mark and try to locate the actual binding, but the next time it tried it would find a shadowing binding, but for a different variable. At this point I realized what a mess this was and decided that my approach wasn't going to work.

When the Scheme compiler became stable enough that the majority of code is expected to be run in compiled mode, it was considered too much of a burden to try to maintain the shadowed variable tracking mechanism and it was removed. MIT Scheme now has a slower, but far simpler interpreter. If you want fast, use the compiler.

Ok, back to my version of MIT-Scheme. For interpreter compatible environments, there is little choice but to deep search on every variable reference. But MIT Scheme only has about 50 or so of these and the vast majority of the code does not need to implement environments in this way. If we assume that incremental definition can only occur in the special interpreter compatible frames, then we can change the environment representation to make variable lookup faster. One popular mechanism is the ‘flat environment’ representation. In this representation a closure does not contain a pointer to a parent environment, but rather it contains a flat vector of pointers to the appropriate value cells for the lexical variables it uses. This makes a tradeoff. When creating a lexical closure, we have to copy a potentially large set of pointers to the lexical bindings, but when we look up a lexical variable, we only need to index into the lexical vector and dereference the pointer we find. Empirically, it seems that the tradeoff is largely worth it.

So I've changed my version of MIT Scheme to use flat environments rather than linked frames for the case where first-class environments are not used. It has been rather tricky, though, and I have a lot of scaffolding in place to double check everything. I'm now carefully removing the scaffolding and cleaning up the code to see if I have a net improvement in performance.

No comments: