Monday, January 31, 2011

Replies to comments

steck said...
Haskell type classes give a consistent interface to different kinds of numbers (among other things). Would that do here?

It would solve the problems with the built-in number hierarchy. Does Haskell allow you to add user-defined subtypes to the built-in number types?

Jevon said...
Are you saying that we shouldn't develop using design patterns? No, there's nothing wrong with the pattern per se. I'm saying that if we have a programming concept that is important enough and ubiquitous enough that we can identify it and name it (for example, “The Factory Pattern” or “Singletons”), then there should be a way of abstracting it so that the pattern is replaced by something much more lightweight, or, if possible, nothing at all.

You can't argue against Factory objects using Factorial as an example. That's like arguing against using Windows when you're using Notepad as an example.
Notepad might be an effective example for certain types of arguments. Factorial obviously shouldn't be encapsulated and turned into a first-class object; that would be absurd. But the language requires that it be encapsulated in some kind of object nonetheless.

Michael said...
Java is verbose, way too verbose in areas that should be concise.
True, but I'm not necessarily complaining about the verbosity. I'm complaining that there is no reasonable way to avoid, eliminate, or abbreviate the verbosity.

Hardware Critic said...
His point is that Java has created a culture where developers over-generalize (violating the en vogue You Ain't Gonna Need It principle) and create frameworks rather than customized solutions. You turned it on its side and asked "why should developers ever have to do this junk" and made it a criticism of the Java language.
Just so.

But in doing so, you take code decorated with "don't do this" and then point to it as an example of Java's failings.
You say "but what about the 1% of people who need all that extra nonsense?", but remember Mr. Woody's point is that developers should create bespoke solutions providing only the power necessary to solve the problem at hand rather than large general frameworks.


I don't think we are contradicting each other. Let's take the “Singleton” pattern as an example. Mr. Woody and I both agree that a Singleton would be an unnecessary complication to the factorial class, but we disagree on why. If you look at the Wikipedia article on singletons, you'll see that there are three examples of how to write singletons in Java:
// Examples taken from Wikipedia
public class Singleton {
 
    private static final Singleton INSTANCE = new Singleton();
 
    // Private constructor prevents instantiation from other classes
    private Singleton() {
    }
 
    public static Singleton getInstance() {
        return INSTANCE;
    }
 
}

// Solution suggested by Bill Pugh
public class Singleton {
 
   // Private constructor prevents instantiation from other classes
   private Singleton() {
   }
 
   /**
    * SingletonHolder is loaded on the first execution of Singleton.getInstance() 
    * or the first access to SingletonHolder.INSTANCE, not before.
    */
   private static class SingletonHolder { 
     public static final Singleton INSTANCE = new Singleton();
   }
 
   public static Singleton getInstance() {
     return SingletonHolder.INSTANCE;
   }
 
 }

// Modern way suggested by Joshua Bloch
 public enum Singleton {
   INSTANCE;
 }
All three are missing my point. They say, “The correct way to use singletons is to write your code like this.” What they ought to say is “The correct way for an implementation of singletons to behave is as if your code were written like this.”

A programmer who decides to use singletons (correctly or incorrectly) ought to be able to write something like:
public singleton FactorialUtil
{
    public static int factorial(int n)
    {
        if (n == 0) return 1;
        return n * factorial(n-1);
    }
}
Yes, it is stupid to do this for something like the factorial program, but it clearly states my (poorly considered) intention, it was trivial to do and trivial to undo, and, best of all, it theoretically allows me to change the implementation of singletons from “traditional” to “Pugh style” without having to look at all the places singletons are used.

To give a “real world” example, I'd have to post a much larger program and then spend time arguing the merits of singletons.

Your point that Java requires significant boilerplate in order to say what needs to be said is valid, if not particularly novel. Even so, using code which was written in an intentionally convoluted way to criticize Java's convolution is like using the winner of the Obfuscated C Code contest to demonstrate that C is difficult to read.
It is hard to make concise arguments against convolution without needlessly convoluting a concise piece of code. A situation like this in real life usually involves a truly huge pile of awful code which makes the blog post far too large and contains many additional problems that hide the point I'm trying to make.

Jason Wilson said...
The Development Chaos Theory article is largely concerned with factories. Factories are used all the time in Scheme and nobody ever complains because both the use and definitions are easy and natural:

USE:
((*factorial-function-factory*) 5)

POSSIBLE DEF:
(fluid-let ((*factorial-function-factory*
             (lambda (n) ....)))
  ...)

How *factorial-function-factory* got defined should not matter to the code depending on it. It's there. I promise. (Guice gives absolutely no more guarantees that it's defined, and it is the best-in-class dynamic variable, err, dependency injection framework for Java.)

The Development Chaos Theory article is of course ridiculous for treating fact as a function that needs to be so abstracted, but without dynamic variables, this really is a necessary pattern in Java in many real world cases.

Exactly. There's just one refinement I'd suggest:
(let ((factorial (make-instance <factorial-function>)))
  (factorial 5))
Object frameworks like CLOS provide a ‘universal factory’ generic function. And I'd suggest this definition of <factorial-function>:
(defclass <factorial-function> () ()
  (:metaclass singleton))
The definition of the singleton metaclass would contain code analogous to Bill Pugh's implementation.
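For concreteness, here's a rough sketch of what such a metaclass might look like. The names are mine, thread safety is ignored, and validate-superclass comes from the Closer to MOP compatibility library; treat it as a sketch of the idea, not production code.
;; A SINGLETON metaclass: the instance is created lazily on the
;; first MAKE-INSTANCE and cached thereafter -- the moral
;; equivalent of Pugh's lazily loaded holder class.
(defclass singleton (standard-class)
  ((instance :initform nil :accessor singleton-instance)))

;; Allow classes whose metaclass is SINGLETON to inherit from
;; ordinary standard classes.
(defmethod closer-mop:validate-superclass
    ((class singleton) (superclass standard-class))
  t)

(defmethod make-instance ((class singleton) &rest initargs)
  (declare (ignore initargs))
  (or (singleton-instance class)
      (setf (singleton-instance class) (call-next-method))))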

But this doesn't answer Mr. Woody's objection that this is completely unnecessary. In that case, I suggest this alternative definition of <factorial-function>
;; Unnecessary
;; (defclass <factorial-function> () ()
;;   (:metaclass singleton))

(defmethod make-instance ((class (eql (find-class '<factorial-function>))) &rest initargs)
  (declare (ignore initargs))
  #'factorial)
I have overloaded make-instance in this one case to not even instantiate an object but to return the factorial procedure as a natural singleton.

Friday, January 28, 2011

Why (not) to write Factorial in Java.

In “Development Chaos Theory”, William Woody laments the habits of bad Java programmers. I blame the language.

Before I start ranting, let me first say that I agree completely with Mr. Woody. If you haven't read his article, I suggest you do now.

Let's follow the example.
/** 
 * @author wwoody
 */
public static int factorial(int n)
{
    if (n == 0) return 1;
    return n * factorial(n-1);
}
This is presented as a typical way someone might write this in Java. I might have written it slightly differently:
public static int factorial(int n)
{
    return (n == 0) ? 1 : n * factorial (n - 1);
}
There is only a subtle difference between these. return, in its fully general form, is a non-local exit. It returns out of the enclosing method instead of out of the enclosing form. The difference becomes obvious if you attempt to manually inline the factorial method.
public static int foo(int x)
{
    int y = (x == 0) ? 1 : x * factorial(x - 1);
    ...etc.
}

public static int bar(int x)
{
    int y = if (x == 0) return 1;          // not legal Java: if is a statement,
            return x * factorial(x - 1);   // and return exits bar itself
    ...etc.
}
Only the first of these inlinings is legal Java; the second doesn't even parse. Yet either way of writing the original method appears to be acceptable Java to me.

There should already be warning bells ringing. I have to admit that I didn't hear them the first time I looked at this code, though. I'm so inured to this crap that I missed a glaring problem.

Simplicio: Why do these methods have type declarations?
Salviati: Java doesn't do type inference.

Simplicio: Ok, that's lame, but why not just declare everything to be Object?
Salviati: Objects are too general. The * method doesn't work on them.

Simplicio: Of course! We want the most general type that supplies ==, *, and - operations. So let's use Number.
Salviati: Hold your horses, cowboy. Arithmetic doesn't work on Numbers.

Simplicio: What? Whyever not?!
Salviati: Arithmetic only works on byte, double, float, int, long, and short.

Simplicio: factorial should work on any of these. Why do I have to pick one?
Salviati: The semantics of these types aren't compatible. They behave quite differently at the very ends of their ranges. There is no way to unify them under a superclass and still preserve the subclass semantics.

Simplicio: umm... whatever. What about that static thingy?
Salviati: That doesn't matter. You aren't using any objects.

Simplicio: If it doesn't matter, why do I have to specify it?
Salviati: It would matter if you were using objects. It tells the compiler that free identifiers should have class scope. By specifying static, you're indicating that you want lexical scope.

Simplicio: But that doesn't matter because I'm not using any objects.
Salviati: Right.

Simplicio: If it doesn't matter, why do I have to specify it?

Then Mr. Woody wraps the method in a utility class:
/** 
 * @author wwoody
 */
public class FactorialUtil
{
    public static int factorial(int n)
    {
        if (n == 0) return 1;
        return n * factorial(n-1);
    }
}
and suggests an iterative implementation
/** 
 * @author wwoody
 */
public class FactorialUtil
{
    public static int factorial(int n)
    {
        int ret = 1;
        for (int i = 1; i <= n; ++i) ret *= i;
        return ret;
    }
}
Simplicio: Ummm....
Salviati: You don't want to know. In this case, we're not really using the class, but you'll want it later to avoid namespace conflicts.

Simplicio: I guess... Isn't namespace management more of a configuration control issue than a datatype issue? Wouldn't I want these to be orthogonal issues?
Salviati: They can be handled orthogonally, but that requires some serious hacking. Right now all you want is a simple factorial problem, so just use the default.

Simplicio: What default?
Salviati: Just pick a name. Remember, though, the class name should be the same as the file name.

Simplicio: If it is the default, why do I have to do anything?

Mr. Woody now suggests that BigInteger is the right return type. We'll fix the code:
public class FactorialUtil
{
    public static BigInteger factorial(int n)
    {
        if (n == 0) return BigInteger.ONE;
        return BigInteger.valueOf(n).multiply(factorial(n-1));
    }
}

alternatively...
/** 
 * @author wwoody
 */
public class FactorialUtil
{
    public static BigInteger factorial(int n)
    {
        BigInteger ret = BigInteger.ONE;
        for (int i = 1; i <= n; ++i) ret = ret.multiply(BigInteger.valueOf(i));
        return ret;
    }
}
What we're seeing here is already a serious problem. Changing the range of the output is a trivial conceptual change, yet it requires large changes to the code. Here are the original code and the modified code; notice how little of the method survives unchanged.
// before changes...
/** 
 * @author wwoody
 */
public class FactorialUtil
{
    public static int factorial(int n)
    {
        int ret = 1;
        for (int i = 1; i <= n; ++i) ret *= i;
        return ret;
    }
}
// after changes...
/** 
 * @author wwoody
 */
public class FactorialUtil
{
    public static BigInteger factorial(int n)
    {
        BigInteger ret = BigInteger.ONE;
        for (int i = 1; i <= n; ++i) ret = ret.multiply(BigInteger.valueOf(i));
        return ret;
    }
}

Sometimes, when you are building a big system, you want to transparently install a far-reaching package. This doesn't happen often, but it happens more often than you might think. Take, for example, the scmutils library. This library augments the standard generic arithmetic in MIT Scheme so that the usual arithmetic operators can handle vectors, matrices, and functions in addition to integers, rationals and reals. No changes need to be made to the code that calls the arithmetic methods. This makes life a lot easier for developers who will not have to remember whether to use the built-in arithmetic or to switch to the extended package.
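You can get the flavor of the trick in Common Lisp terms. This is a sketch of the idea, not scmutils itself: shadow the standard operator with a generic function in a package of your own, and client code keeps calling * exactly as it always did.
;; A package that shadows CL:* with a generic (binary, for brevity)
;; multiplication.
(defpackage :generic-arith
  (:use :common-lisp)
  (:shadow #:*)
  (:export #:*))
(in-package :generic-arith)

(defgeneric * (x y))

;; Ordinary numbers behave exactly as before...
(defmethod * ((x number) (y number))
  (cl:* x y))

;; ...vectors can be scaled...
(defmethod * ((x number) (y vector))
  (map 'vector (lambda (e) (cl:* x e)) y))

;; ...and functions multiply pointwise.
(defmethod * ((x function) (y function))
  (lambda (n) (cl:* (funcall x n) (funcall y n))))
Code compiled in a package that uses GENERIC-ARITH picks up the extended arithmetic without a single textual change.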


Mr. Woody now points out that one might want to memoize the answers.
/** 
 * @author wwoody
 */
public class FactorialUtil
{
    static HashMap<Integer, BigInteger> cache = new HashMap<Integer, BigInteger>();

    public static BigInteger factorial(int n)
    {
        BigInteger ret;

        if (n == 0) return BigInteger.ONE;
        if (null != (ret = cache.get(n))) return ret;
        ret = BigInteger.valueOf(n).multiply(factorial(n-1));
        cache.put(n, ret);
        return ret;
    }
}
To quote, “easy enough.”

Except it isn't. Memoization is a general technique that has nothing to do with the factorial program. Yet it isn't abstracted out of the problem. Tom White wrote an article on how to use Dynamic Proxy Classes to perform this abstraction. The code is a bit convoluted, but that doesn't matter because when all is said and done, we can just memoize that call to factorial.

That's the theory, anyway. It turns out that you'll need to build an interface to the factorial class and then memoize that:
/** 
 * @author wwoody
 */
public interface FactorialAlgorithm 
{
  public BigInteger factorial(final int n);
}
/** 
 * @author wwoody
 */
public class FactorialUtil implements FactorialAlgorithm
{
    public BigInteger factorial(int n)
    {
        BigInteger ret = BigInteger.ONE;
        for (int i = 1; i <= n; ++i) ret = ret.multiply(BigInteger.valueOf(i));
        return ret;
    }
}
FactorialAlgorithm originalFactorial = new FactorialUtil();
FactorialAlgorithm memoizedFactorial =
   (FactorialAlgorithm) Memoizer.memoize(new FactorialUtil());
That's better, but there's still a problem. All the places in the code where you used to call FactorialUtil.factorial(n) now need to change to use the interface instead: memoizedFactorial.factorial(n). And every caller has to have access to an instance of the dynamic proxy class at runtime.
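For contrast, here is what abstracting out memoization looks like when the language lets you abstract over functions directly. A minimal Common Lisp sketch, offered only to show the scale of the thing:
;; Memoization as a reusable higher-order function.
(defun memoize (f)
  (let ((cache (make-hash-table :test 'equal)))
    (lambda (&rest args)
      (multiple-value-bind (value presentp)
          (gethash args cache)
        (if presentp
            value
            (setf (gethash args cache) (apply f args)))))))

;; To memoize FACTORIAL without touching its definition or its
;; callers:
;;   (setf (symbol-function 'factorial)
;;         (memoize (symbol-function 'factorial)))
No interface, no proxy, and the callers are none the wiser.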

Let's continue. Mr. Woody points out that we want to select which version of factorial to use at runtime. After all, we don't even need the BigInteger version if the actual data only contains single digit numbers. To do this, he suggests a factory class that can examine configuration data and instantiate the appropriate implementation. I won't bore you with the details.

Mr. Woody then asserts:
Sure, there are circumstances where a pluggable architecture would be desirable or even necessary—but that comes up so rarely it’s not even funny. About 99% of the time I see code like this, it’s completely and utterly useless. It’s obscuring the purpose of the code—replacing a two or three line utility (max!) with dozens or even hundreds of lines of self-important masturbatory Java bullshit. Sure, it may make you feel good—but it makes an ugly mess of it that future developers will have to clean up, or more likely just avoid like the plague.
Again, I want to stress that I agree with Mr. Woody and I'm not attempting to argue against him. But I'd like to interject and raise a point. What about the rare situations? About 1% of the time you really do want a pluggable architecture, and code like this is not completely useless. The problem is that you still need “dozens or even hundreds of lines of self-important masturbatory (but necessary) Java bullshit.” Since there is a genuine need for the pluggable architecture, you can be sure that future developers won't be cleaning it up. Unfortunately, that leaves “avoid like the plague” as our only option.

Mr. Woody says:
The biggest complaint I have with many Java developers is that they develop a whole bunch of really bad habits. Specifications are unclear, or they think someday the code may need to be extended into a different direction. So they write a whole bunch of overblown architectural nonsense, sight unseen, thinking that the additional crap someday will help out and make things easier.
But how does this differ from the case where the additional crap really does make things easier? The code doesn't change simply because it becomes more useful. It is still a whole bunch of overblown architectural nonsense. The only difference is that you are forced to use this crap because it is the only way to get things done when the simple solution is insufficient.

My complaint with Java is that it tends to exacerbate a bad situation. Small conceptual changes to code, like changing an int to a BigInteger, end up being large textual changes. Larger conceptual changes, like adding a Factory object or a dynamically pluggable implementation, cannot be abstracted out. This leads to big piles of additional crap that have to be replicated all over the place. The language is unfriendly to change and modern software development is all about managing changes to the body of code.

A day in the life

I seem to have a career as a professional software engineer.  Don't get me wrong, I love doing it, but somehow my avocation became my vocation.  C'est la guerre.

Last night I was thinking about what my job really entails and it is a bit different from what I usually think and tell people.  The exciting part is taking a poorly formed concept, refining it to a very detailed and well-formed formal concept, translating the ideas to software, and making a tool or service that people can use.  There is a tedious part, too.  Frankly, I don't find it too monotonous because it involves a lot of puzzle solving and finding creative solutions, but it is necessary work that doesn't seem logically necessary, and that tends to dampen my enthusiasm for doing it.

I spend a lot of time just “plumbing”.  There is a lot of infrastructure where I work.  In all the projects I've worked on, the vast majority of the code has already been written.  So the very first phase of software development is identifying the infrastructure subcomponents that the project should depend upon.  This is pretty easy to do.  Any basic project will always have these components:

  • A web server front-end with the following properties:
    • Automated load balancing.
    • Automated failover.
    • High availability.
    • Resistance against DOS attacks.
  • Persistent storage or a database with these properties:
    • Geographically separate replicas.
    • Regular backups.
    • Tools for manual editing and revision of stored data.
  • Software attendants to keep it running ‘unattended’.
My current project involves selling things (like all businesses need to do on occasion). There is infrastructure to handle purchases, refunds, deal with fraud, etc. We'll also need some more infrastructure to handle user authorization and authentication. And I didn't mention the infrastructure needed to deliver the purchased items. When it all comes down to it, there isn't a whole lot of logic involved in the project itself. Almost all the functionality comes from the underlying infrastructure. Our team spends a lot of time plugging it together. That doesn't mean that it is easy or that it doesn't involve any creativity.

On most projects I work on, there is some piece of software that I intend to use but have only a very vague idea of how it works. My task might be to hook it up.

Of course I spend several days thoroughly reading and understanding the copious documentation.

This is what I really do: I know that the end user is going to be sitting in front of his web browser and at some point click on something. This will eventually trigger a method call that performs the requested operation. Let's assume the user wants to buy something and that the mouse click the user made was the last one before the purchase (the one that asks “Ok to charge your credit card?”). What can I deduce?
  1. There ought to be a method that ultimately handles the ‘purchase’.
  2. Since buying something must be a very common operation, I expect there to be some easy way to charge the user's credit card.
    1. I first look for the ‘quick start’ guide or the FAQ.
    2. If I can't find an answer there, I look at the table of contents of the detailed documentation.
    3. A lot of the time there isn't any documentation, or if there is, I can't find it; the end result is the same. In that case, I consider whether there are similar projects that might want to use the same infrastructure in the same way. I'll eventually find some code that actually works.
    4. Now comes a trick. I sketch out in my head a few possible protocol designs. There should be a way, given a user and an item, to “begin” the process of purchasing. There should be a way to determine if the process has completed. There should be a way to determine the outcome of the process. Each of these mechanisms should exist as separate calls because we might have to re-do any part of the process if the computer crashes during the purchase. But since the usual case will involve rapid turnaround and no crashes, there should be an abstraction that handles the entire process from end to end. I poke around in the docs and in the source code to see if these things exist.
    5. Now I analyze what I've found and make a guess at the sophistication of the design and the programmer. If I see an API that involves continuation-passing style or a variant (Java callbacks are poor-man's CPS), then I expect a high level of sophistication from the API. I expect there to be a ‘standard’ way of purchasing something and I expect that all the edge cases are handled in a reasonable way. Why? Anyone that successfully uses a complex concept like continuation-passing style has most likely applied a lot of serious thought to the API.
    6. On the other hand, if I see an API that involves a lot of methods that return void and must be called in the right order or they won't work, or an API with ambiguous methods like checkPurchase() and validatePurchase() (which one do I use?), or one where the arguments must be strings that will then be parsed to something that makes more sense, or APIs that throw only a single catch-all error (or just return null on success or maybe null on failure), then I expect that I'm going to have to think hard about all the edge cases myself and I'm going to have to wade through a lot of examples to determine the right ‘magic’. Why? People that write APIs like this aren't thinking. They code up the first solution that comes into their head and when they are done, they call it an API. It is likely that they either didn't think there were edge cases, or it didn't occur to them that they should abstract them away.
  3. So I finally determine what methods need to be called. I look at the argument lists of the methods and I look at the arguments passed in to the servlet. These are generally isomorphic in some way, but it is typically the case that the servlet is being passed a string representation and the underlying method needs an object. If the API isn't completely idiotic, it will provide parsers and unparsers for the data that represent major concepts. So the servlet might get a string that represents a user and a string that represents what he wants to purchase. Now I have to figure out how to call the method.
    1. If the API is well designed, there might even be a method that takes the strings directly from the servlet and does all the work including checking inventory and scheduling order fulfillment. This is the rare case.
    2. If the API is pretty good, then I'll probably have to make the appropriate calls to parse the servlet arguments to actual data, handle errors, and validate the input. Then, with the parsed arguments in hand I can call the purchase method.
    3. If the API is lame, then after the argument parsing I'll have to check the database to see if there is inventory and somehow reserve it. Then I'll run the purchase methods (which will not be unified into one abstraction, but consist of five methods that need to be called in the right order). Then I'll have to see if the purchase succeeded, and if not, return the reserved item to inventory, or if it did, arrange for the item to be delivered.
So how do I know how to use the API if it is undocumented? The simple answer is this: look for methods whose type signatures match the objects at hand in the servlet. There probably won't be very many. Now look at the names of the methods and see if any make sense. easyPurchase(User user, Item item) seems a lot more likely than getReceipt(User user, Collection items). A really good API uses names that practically shout out their purpose.

Finally I try out the code and see if it works. If it does, I call it a day. If not, then it is time to twiddle, tweak, and frob things. If the API is good, then I'll get informative error messages that tell me what I did wrong. Something along the lines of “Error: user id is an email address, did you forget to call parseUser()?” instantly tells me what I'm doing wrong. Something along the lines of “null pointer exception” followed by a stack trace filled with methods I've never heard of tells me nothing whatsoever.

This is work. It isn't extremely difficult, but it does require practice, skill, and imagination. It also isn't really engineering. It is more like plumbing. Get this stuff here and put it over there, and you might need an adapter fitting.

Monday, January 24, 2011

A bit of history

The computers of the 50s were very primitive machines. The IBM 704, for example, had 4096 words of ferrite core memory (later expandable up to 32768 words) with an access time of 17 microseconds. The Central Processing Unit, which was not a monolithic silicon chip but a cabinet full of vacuum tubes, could execute 40 thousand instructions per second (a twenty-five microsecond basic execution cycle).

It cost $50,000 a month just to rent one. (That's more than $4 million a year in today's dollars.) And this was one of the most advanced machines available at the time. These days, the level-1 cache controller on an ordinary CPU would put this kind of power to shame.

The computer languages at the time were bizarre to say the least. Of course no one at all had any experience programming, let alone designing languages to make programming easier. The conventional wisdom at the time was that ‘information processing’ required an approach that differed fundamentally from ‘numeric calculation,’ so people tried out all sorts of weird ideas. One idea was string processing. The COMIT language was one of the earliest string processing languages that was based around the idea of rewrite rules. It is sort of like sed on acid:
(8             PROGRAM TO BRING IN A NUMBER OF CONSTITUENTS AND PRINT OUT-
               ALL OF THE N FACTORIAL PERMUTATIONS OF THEM)
                       (PROBLEM SUGGESTED BY B. ULVESTAD)

INPUT                                $ = Q + 1 + N       //*RSA1              INPUT
*                                    N = QL + QR/.1                               *
NUMBER Q + $ + $1 +  QL +  $ + QR  + N = 1 + 2 + 4 + 3 + 5 + 7/.*6 + 6/.I1   NUMBER
*              Q + QL + $ + N + $ + QR = 3 + 2 + 6/.1 + 4 + 5                     *
PRINT                          $1 + QL = 2 + 1          //*WAB2               PRINT
*                 QL + $ + $1 + QR + N = 2 + Q + 3 + 1 + 5 + 4                    *
PERMUTE $1 + Q + $1 + $ + QL + $ + QR + N = 3 + 2 + 4 + 1 + 5 + 6 + 7 + 8/.D1     *
TEST               QL + $ + QR + $/.G0 = 1 + 3 + 2 + 4                       NUMBER
MOVE               $1 + Q + $ + QR + 1 = 2 + 1 + 3 + 5 + 4                  PERMUTE
Yngve, Victor H. 1958. “A programming language for mechanical translation.” Mechanical Translation 5 (1). 25-41

The Information Processing Language, IPL, on the other hand, is more like assembly code with linked-list support:
            COMMENTS                TYPE  NAME  PQ  SYMB   LINK

Type-9 card                          9
                                     1
Type-2 cards define regions.         2     A                1
The B-Region consists of             2     B                1
 one cell, named "B0" or just "B".   2     I                1
                                     2     L                2
The R-region has 2 cells,            2     R                2
 named "R0" and "R1".                2     v                1
                                     2     (                1
                                     2     )                1
Type-1 cards are for comments.       1
                                     1
Type-5, Q=1 says "DATA FOLLOWS."     5           1

L1---(AI(A v B))                          LI        0
                                                    (
                                                    A
                                                    I
                                                    (
                                                    A
                                                    v
                                                    B
                                                    )
                                                    )        0

Type-5, Q=blank says "ROUTINES       5
 FOLLOW."                            1
                                     1
R1 prints the list L1.               1
                                     1
Q=3 says "TRACE THIS ROUTINE."            R1     13 L1
                                                    J151    0

P or Q = 0 can be left blank.

Type-5, SYMB=R1 says execute R1      5              R1
Correspondence between Hugh S. Kelley and Chuck Baker.

These languages differ from INTERCAL in that their designers were serious.

A notable exception was the language ALGOL.
begin integer i; real y; real array a[0:10];
   real procedure f(t); real t; value t;
     f := sqrt(abs(t)) + 5*t^3;
   for i := 0 step 1 until 10 do read(a[i]);
   for i := 10 step -1 until 0 do
     begin y := f(a[i]);
           if y > 400 then write(i, "TOO LARGE")
           else write(i,y);
     end
   end
(taken from Wikipedia article on the Trabb Pardo-Knuth algorithm)

ALGOL doesn't look unusual to the modern eye. That in itself is testimony to the fact that ALGOL was years ahead of its time. As you can imagine, though, it was well beyond the capability of most computers to compile and run an ALGOL program.

When McCarthy started the “Advice Taker” project, he proposed a new language LISP that would be based around the linked-list paradigm of IPL and the λ-calculus of Alonzo Church. One of the goals of the language was to deliberately move away from the Turing-inspired mechanistic model of computation (Turing machine) and towards a more mathematical way of programming based upon recursion. (Incidentally, ALGOL was designed to support recursion, but the very first ALGOL implementations punted because it was too complicated to support and users didn't know how to use it.)

Wednesday, January 19, 2011

Fifty Years Ago

The character-reading functions make it possible to read characters one by one from data records. The choice of input medium is determined by sense switch 1, though at a future time mode-set facilities may be added so that input may be taken from any medium. This description will be given in terms of hollerith cards [sic] as input; the procedure for tape is completely analogous. However, the tape records must be 72 characters long.
Paul Abrahams, Artificial Intelligence Project
RLE and MIT Computation Center Memo 22a
Character-Handling Facilities in the LISP System


In the months following McCarthy's initial description of ‘A Symbol Manipulating Language’, Steve Russell translated McCarthy's eval into IBM 704 machine code and the first LISP was born. By 1961 the implementation of Lisp 1.5 was well under way and a number of new libraries were being written.


Hollerith cards were the main input medium in those days. They were a step up from punched paper tape, though. In fact, punched cards were pretty much the only input medium until online input became available.
Paul Abrahams, personal correspondence


Lisp had read, eval, and print, but the computer itself was shared by the MIT Community. You would submit your card deck for processing and come back some time later to get the printout.  The REPL was run in batch mode.

Tuesday, January 18, 2011

Answers for Faré

Faré said...
Problem is that indeed you need to do something about those slots without keyword initializers, slots that are computed from other slots (e.g. hash values), slots that are initialized from a counter (e.g. OID), slots that are actually plumbing from a larger structure (e.g. indexes back into other objects), etc.

You would think so, but it turns out you don't.

When an object is created for the very first time, you must have already written code to deal with slots without keyword initializers, slots computed from other slots, slots that are initialized from a counter, slots that are plumbing, etc. or you wouldn't have been able to create the object. When the object is re-instantiated from the persistent store, there is no reason you cannot simply perform those operations (mutatis mutandis) again.

But what if these operations are not idempotent? Actually, we already know they are not. Each time we call the constructor, we ought to be getting a brand new object, so we don't want the operations to be idempotent. But note that the object is constructed exactly once per ‘session’ — the constructor is never called twice without dropping all references to the constructed object in between the calls. Therefore, it is not possible to ever observe two separate calls to the constructor. (By ‘observe’ I mean, “You cannot write an extensional program that returns 0 if exactly one call to the constructor occurs, but returns 1 otherwise.”)

Certainly one could, through reflection and other mechanisms, write code that intensionally exposes the implementation, but one can always write code that deliberately breaks an abstraction barrier. Although it seems far too easy to break this abstraction barrier by something as mundane as computing a slot value from other slots, in practice it turns out that you won't want to. Again, this can be derived from an information-theoretic argument. At the point of invoking the constructor, the program has provided all the information necessary to correctly initialize the new instance. Any other information used is, by definition, ‘implicit’.

Let us assume, for a brief moment, that the implementation depends critically upon the implicit part of the initialization process. I simply argue that critical dependence upon implicit information is a serious bug, regardless of whether persistence is brought into the picture or not. If the implicit information is designed to be implicit, then the onus is upon the designer to hide the implicit dependency well enough that the client code need not be aware of it.

So let us assume the opposite: the implementation does not depend critically upon the implicit part of the initialization process. Well, if there is no critical implicit dependence, there are no bugs that occur because of a critical implicit dependence.

The approach still works, but is actually very low-level, and calls for a higher-level interface of some sort, lest your whole system keep a very low-level feel.

Not at all. In ChangeSafe, the abstraction of a versioned object is built directly upon the persistent object abstraction by layering yet another set of MOP customizations on the existing ones that implement persistence. There is no higher-level interface beyond the normal CLOS interface, yet there is very little code that needs to have explicit calls to the persistence API.

I doubt these arguments will persuade anyone, so I suggest trying to implement things this way and seeing for yourself.

And this reminds me of another thing. Earlier you asked why ‘pointer swizzling’ was performed on reading rather than upon loading of the persistent object. I mentioned a few reasons, but I forgot one really important one: it allows you to build persistent circular structure without needing mutable objects.

Thursday, January 13, 2011

Some Bells and Whistles

There are a few bells and whistles that I added to my persistent store. The first is multiple extents (that is, more than one persistent store open and participating in a transaction). This is fairly easy to add. The only tricky part is ensuring that a transaction that involves multiple extents either fully commits or aborts across all participating stores. The way I chose to implement this is by choosing a single particular persistent store to be the transaction master. The commit record in the transaction master is the one that holds the authoritative record of whether the entire transaction has committed.

The commit proceeds in two phases. In the first phase, each non-master store performs a ‘conditional commit’. This involves flushing the pending writes to the backing store and writing a special commit record that refers to the transaction master. Once each non-master store has written the special commit record, the commit record is written to the master store.
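In outline, the commit sequence looks something like this. Every name here is hypothetical, and the stubs merely mark where the real store operations go:
;; Stubs standing in for the real store operations.
(defun flush-pending-writes (store)
  (declare (ignore store)))
(defun write-conditional-commit (satellite master)
  (declare (ignore satellite master)))
(defun write-commit-record (store)
  (declare (ignore store)))

(defun commit-transaction (master satellites)
  ;; Phase 1: each satellite flushes its pending writes and records
  ;; a conditional commit naming the master as the authority.
  (dolist (satellite satellites)
    (flush-pending-writes satellite)
    (write-conditional-commit satellite master))
  ;; Phase 2: the master's commit record is the single authoritative
  ;; bit.  If this write completes, the entire transaction committed;
  ;; if it never happens, every conditional commit is void.
  (write-commit-record master))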

If nothing unusual happens, processing may continue as in the normal case. The interesting thing happens when the non-master store is first opened. If the most recent commit record in a store is a ‘conditional commit’, then it is necessary to check the transaction master to see if it committed. Typically, the relationship between the different stores is that there is one master and one or more ‘satellite’ stores. Therefore, it is likely that the transaction master is already open and within a transaction when the satellite store is opened, so the master commit record will be easily at hand. If not, then the code recursively opens the master store to determine if the conditional commit of the satellite is valid. This adds a small amount of bookkeeping to the commit code, but it isn't particularly complex.

Nested Transactions and Multi-version Concurrency Control (MVCC)

My original plan had been to avoid the complexities of nested transactions and MVCC. I designed this as a simple store, and I figured that a good number of applications would not need nested transactions. I discovered almost immediately that I was wrong. If nested transactions are disallowed and a procedure needs to start a transaction, it has to determine whether an existing transaction is in progress. If it is not, then the procedure needs to begin the transaction and ensure it is committed when it exits. On the other hand, if there is an existing transaction, the procedure should ‘join’ that transaction and let whoever started the transaction decide whether or not to commit. This isn't so bad if you can segregate your procedures into ‘top-level’ (ones that always start and commit a transaction) and ‘subroutines’ (ones that never start and commit a transaction, but expect there to be one in place before being called). Unfortunately, recursive procedures cannot be in both categories simultaneously. A possible workaround is to have duplicate code tailored for each use case, but then the programmer would have to remember which version to call under which circumstances, or we'd have to avoid using recursion. Since this is clearly unacceptable, I bit the bullet and implemented nested transactions.

With the current design, nested transactions turn out to be pretty simple. When a non-nested transaction starts, it gets a copy of the most recent object-map, which it uses for OID resolution. When a nested transaction starts, it gets a copy of the current object-map from the immediately enclosing transaction. The nested transaction gets to see the intermediate state of the enclosing transaction. If the nested transaction commits, it does not write a commit record to the store. Instead, the object-map from the nested transaction is used as the object-map of the enclosing transaction when control is returned. Thus the enclosing transaction gets to see the state that the nested transaction committed. When the enclosing transaction commits, the object-map (which includes the nested transaction updates) is written to the store and committed.

If the enclosing transaction aborts, the nested transaction state is simply dropped on the floor and becomes inaccessible. The interesting question is how to handle the case where the nested transaction aborts. In this case, we return control to the enclosing transaction, but we discard the object-map from the nested transaction. The enclosing transaction therefore sees no effect at all from the nested transaction. The enclosing transaction has the opportunity to itself abort (in which case all the intermediate state is discarded), to commit (in which case only the effect of the enclosing transaction is made permanent), or to perhaps continue processing and begin more nested transactions.

In effect, nested transactions act like software transactional memory.
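A sketch of the object-map discipline, again with hypothetical names and with abort handling elided:
(defstruct transaction store object-map)

(defvar *current-transaction* nil)

;; Stubs for the real store operations.
(defun most-recent-object-map (store)
  (declare (ignore store))
  '())
(defun commit-object-map (store object-map)
  (declare (ignore store object-map)))

(defun call-with-transaction (store thunk)
  (let* ((parent *current-transaction*)
         ;; A nested transaction starts from the enclosing
         ;; transaction's current view; a top-level transaction
         ;; starts from the most recently committed object-map.
         (*current-transaction*
          (make-transaction
           :store store
           :object-map (if parent
                           (transaction-object-map parent)
                           (most-recent-object-map store)))))
    (funcall thunk)
    (if parent
        ;; A nested commit hands its object-map back to the
        ;; enclosing transaction; nothing is written to the store.
        (setf (transaction-object-map parent)
              (transaction-object-map *current-transaction*))
        ;; Only a top-level commit writes a commit record.
        (commit-object-map store
                           (transaction-object-map *current-transaction*)))))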

At this point, one can see how MVCC can be easily implemented. Each concurrent thread runs its own transactions with its own object-map. Since all OID resolution is performed against the object-map associated with the current (thread local) transaction, each concurrent transaction sees only its view of the persistent store. (This is a necessary condition, but it is not sufficient. We have to ensure that committed transactions can be given a serial order to preserve the isolation between transactions. We do this by recording in each transaction the OIDs of every persistent object read and written within the transaction. When the transaction attempts to commit, the set of reads and writes is compared against the set of objects read and written by any other parallel transaction that has committed since this one started. If there is an overlap, we abort. That is, we use optimistic locking.)
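The conflict test itself is small. A sketch over the recorded OID sets, represented here as plain lists:
;; TX must abort if its reads or writes overlap the writes of a
;; transaction that committed after TX began, or if its writes
;; overlap that transaction's reads.
(defun conflictp (tx-reads tx-writes committed-reads committed-writes)
  (or (intersection tx-reads committed-writes)
      (intersection tx-writes committed-writes)
      (intersection tx-writes committed-reads)))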

(This should answer Faré's question about why OID resolution is performed on each read rather than being done just once when the store is opened.)

These features involve adding a good chunk of bookkeeping code, but the code is straightforward, albeit tedious. Good testing code is vital to ensuring that this all works as it should.

Friday, January 7, 2011

Schema evolution finessed

Schema evolution is the bête noire of the object-oriented database world. It is probably the most labor-intensive and error-prone activity that an end-user is likely to attempt. The root problem is that the metadata about persistent objects is implicitly replicated in several ways.

The big advantage of object-oriented databases is that they unify the query language with the ‘business logic’ language. You access and navigate persistent objects in exactly the same way you access and navigate the ordinary data structures of the language. The obvious benefit is that the mechanisms of persistence and querying are hidden behind the object abstraction. But this means that the data structure definitions are performing double duty. They are used by the compiler to generate the appropriate primitive operations in the code, but they are also needed by the database component to generate the persistent representation of objects. If a data definition changes, objects that conform to the old definition will no longer work.

Programmers generally expect that a recompile is all that is needed to change how a program handles data structures. But if the data definition is used by the database for object storage and retrieval then changing the data definition may change the persistent representation and make it impossible to retrieve data. (Or worse, it could silently retrieve the data incorrectly.)

It turns out that if persistent objects are immutable then there is a way to finesse the issue of schema evolution.

In CLOS, objects are created and initialized when the programmer invokes make-instance. There is a well-defined series of operations that take place and the resulting object is returned to the caller. The important thing here is the intent of the programmer. The arguments given to make-instance contain the specific information that the programmer has determined is needed to correctly initialize the object. The returned instance will depend on the argument values and any default values that are supplied in the class definition. From an information theoretic view, the argument list to make-instance and the resulting instance ought to contain the same information. So rather than attempting to persist the resulting instance, we instead persist the argument list to make-instance.
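A sketch of the mechanism, with a toy in-memory log standing in for the real store (the names are hypothetical; the real system hooks the object creation protocol itself):
(defvar *creation-log* '()
  "Toy stand-in for the persistent store.")

(defun make-persistent-instance (class-name &rest initargs)
  ;; Record exactly what the caller said...
  (push (cons class-name initargs) *creation-log*)
  ;; ...then let MAKE-INSTANCE do whatever it does.
  (apply #'make-instance class-name initargs))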

The immediate objection to this is that there is no way to know what on earth make-instance constructed! It could be the case that make-instance decides to return NIL. There could be a very complex initialization protocol that defaults certain slot values and computes other slot values from the initargs. This doesn't matter. The programmer is working at the abstraction level where the entire process is kicked off by the invocation of make-instance, and thus his intent is simply to create an instance of an object that is parameterized by the given initargs.

When we retrieve an object from the database, we don't attempt to retrieve the representation of the constructed object, we instead retrieve the arguments that were handed to make-instance and simply invoke make-instance again to re-create the object.

Now suppose that we have a database where we are modeling automobiles. Let us assume that we have already stored thousands of automobile instances and that each instance contains, say, the number of doors and the color of the automobile. At some point we decide that we no longer care about the number of doors, but we do want to know the make and model of the automobile. We change the definition of the automobile class in the program. When we invoke make-instance, we now have a different set of initargs. However, the legacy objects in the database are reconstructed by calling make-instance with the old argument lists. This is easily dealt with. The older argument lists will have initargs describing the number of doors and the color of the car. We arrange for make-instance to ignore any initargs that are unused, and we require the programmer to supply ‘reasonable defaults’ for initargs that are not supplied. Thus when retrieving an older object, the count of the number of doors will be ignored, and a default value will be supplied for the make and model.
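Conveniently, CLOS already has the knobs this needs: :allow-other-keys t makes make-instance ignore initargs the class no longer accepts, and :default-initargs supplies values for initargs the old argument lists never mentioned. A sketch with the automobile example:
;; The new class: DOORS is gone; MAKE and MODEL have defaults.
(defclass automobile ()
  ((color :initarg :color :reader automobile-color)
   (make  :initarg :make  :reader automobile-make)
   (model :initarg :model :reader automobile-model))
  (:default-initargs :make "unknown" :model "unknown"))

;; Replay a saved argument list, ignoring initargs (like :DOORS)
;; that no longer mean anything.
(defun restore-instance (class-name saved-initargs)
  (apply #'make-instance class-name :allow-other-keys t saved-initargs))

;; A legacy argument list still restores:
;;   (restore-instance 'automobile '(:color "red" :doors 4))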

Note that we no longer require a schema to describe the layout of objects in the database.

There are drawbacks to this strategy. First, we are making the assumption that all the relevant information for object creation is supplied in the call to make-instance. A perverse programmer can easily bypass this assumption. However, it doesn't seem unreasonable to require that programmers either use the object creation protocol in the intended manner or specifically arrange for auxiliary helper objects to be created that hold any missing relevant information. This is meant as a tool for programmers, not as a fool-proof system. (Incidentally, the full-blown system I wrote allows the programmer to declare certain slots as transient. This lets a programmer cache information needed at runtime without forcing that information to be persisted.) Second, we give up the ability to side-effect persistent objects. (As I have noted in previous posts, we can mimic side effects reasonably easily.)

These disadvantages are minor compared to the difficulties of schema evolution.

I'm happy to answer questions about this.

Tuesday, January 4, 2011

EQL Specializers

Pascal Costanza asked an interesting question: Why are EQL specializers convenient for smooth integration with object instantiation?

My original answer was simply that when you want to customize make-instance for a particular class of objects that you end up writing methods like this:
  (defmethod make-instance ((class (eql (find-class 'foo))) ...
This is doable without EQL specializers, but it would be kludgy.

While thinking about this some more, though, it occurred to me that EQL specializers are probably a necessary element for reflective programming.

When you reflectively program the object system, you operate at a meta-level where abstractions such as classes are reified as concrete instances of a meta-class. At the reflected level, object instances are no longer the data being manipulated. Instead, object classes are used to represent sets of instances, and the classes themselves are represented by instances of meta-class objects.

When you cross abstraction levels in this way, you need a set of level-shifting primitives. One of the most important of these is quotation. Quotation is a level-shifting primitive that lifts (injects) an instance from beneath the abstraction level to the level itself.

In CLOS, EQL specializers are the quotation mechanism for method dispatch.