Tuesday, January 18, 2011

Answers for Faré

Faré said...
Problem is that indeed you need to do something about those slots without keyword initializers, slots that are computed from other slots (e.g. hash values), slots that are initialized from a counter (e.g. OID), slots that are actually plumbing from a larger structure (e.g. indexes back into other objects), etc.

You would think so, but it turns out you don't.

When an object is created for the very first time, you must have already written code to deal with slots without keyword initializers, slots computed from other slots, slots that are initialized from a counter, slots that are plumbing, and so on, or you wouldn't have been able to create the object. When the object is re-instantiated from the persistent store, there is no reason you cannot simply perform those operations (mutatis mutandis) again.
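As a rough illustration of the point above, here is a minimal sketch of a slot computed from another slot. It assumes the store keeps only the initargs originally passed to make-instance; the account class, the name-hash slot, and *saved-initargs* are hypothetical names invented for this example, not part of any real persistence API.

```lisp
;; Hypothetical class: NAME-HASH has no initarg and is derived from NAME.
(defclass account ()
  ((name      :initarg :name :reader account-name)
   (name-hash :reader account-name-hash)))

;; The code that computes the derived slot already has to exist,
;; or the object could never have been created in the first place.
(defmethod initialize-instance :after ((a account) &key)
  (setf (slot-value a 'name-hash) (sxhash (account-name a))))

;; Assumption: the store recorded only the explicit initargs.
(defparameter *saved-initargs* '(:name "alice"))

;; First creation, and a later "re-instantiation" from the saved initargs.
(defparameter *original* (make-instance 'account :name "alice"))
(defparameter *restored* (apply #'make-instance 'account *saved-initargs*))
```

Running the same initialization again fills in the derived slot of the restored object, so nothing about name-hash needs to be stored separately.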

But what if these operations are not idempotent? Actually, we already know they are not. Each time we call the constructor, we ought to be getting a brand new object, so we don't want the operations to be idempotent. But note that the object is constructed exactly once per ‘session’: the constructor is never called twice without dropping all references to the constructed object between the calls. Therefore, it is not possible to ever observe two separate calls to the constructor. (By ‘observe’ I mean, “You cannot write an extensional program that returns 0 if exactly one call to the constructor occurs, but returns 1 otherwise.”)

Certainly one could, through reflection and other mechanisms, write code that intensionally exposes the implementation, but one can always write code that deliberately breaks an abstraction barrier. Although it seems far too easy to break this abstraction barrier by something as mundane as computing a slot value from other slots, in practice it turns out that you won't want to. Again, this can be derived from an information-theoretic argument. At the point of invoking the constructor, the program has provided all the information necessary to correctly initialize the new instance. Any other information used is, by definition, ‘implicit’.

Let us assume, for a brief moment, that the implementation depends critically upon the implicit part of the initialization process. I simply argue that critical dependence upon implicit information is a serious bug, regardless of whether persistence is brought into the picture or not. If the implicit information is designed to be implicit, then the onus is upon the designer to hide the implicit dependency well enough that the client code need not be aware of it.

So let us assume the opposite: the implementation does not depend critically upon the implicit part of the initialization process. Well, if there is no critical implicit dependence, there are no bugs that occur because of a critical implicit dependence.

The approach still works, but is actually very low-level, and calls for a higher-level interface of some sort, lest your whole system keep a very low-level feel.

Not at all. In ChangeSafe, the abstraction of a versioned object is built directly upon the persistent object abstraction by layering yet another set of MOP customizations on the existing ones that implement persistence. There is no higher-level interface beyond the normal CLOS interface, yet there is very little code that needs to have explicit calls to the persistence API.

I doubt these arguments will persuade anyone, so I suggest trying to implement things this way and seeing for yourself.

And this reminds me of another thing. Earlier you asked why ‘pointer swizzling’ was performed on reading rather than upon loading of the persistent object. I mentioned a few reasons, but I forgot one really important one: it allows you to build persistent circular structure without needing mutable objects.
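To make the circular-structure point concrete, here is a minimal in-memory sketch. It assumes objects refer to each other by OID and that the OID is resolved only when the slot is read; the *store* table, the node class, and make-node are hypothetical stand-ins for a real persistent store.

```lisp
;; Hypothetical OID -> object table standing in for the persistent store.
(defparameter *store* (make-hash-table))

;; Nodes have readers only: once created, they are never modified.
(defclass node ()
  ((oid      :initarg :oid      :reader node-oid)
   (name     :initarg :name     :reader node-name)
   (peer-oid :initarg :peer-oid :reader node-peer-oid)))

(defun make-node (oid name peer-oid)
  "Create a node and register it in the store under OID."
  (setf (gethash oid *store*)
        (make-instance 'node :oid oid :name name :peer-oid peer-oid)))

(defun node-peer (node)
  "Swizzle on read: look up the peer's OID only when it is asked for."
  (gethash (node-peer-oid node) *store*))

;; Node 1 names OID 2 before node 2 exists; that is fine because
;; the reference is not resolved until NODE-PEER is called.
(make-node 1 "a" 2)
(make-node 2 "b" 1)
```

Had the reference been resolved at load time, node 1 would have needed a pointer to a node that did not exist yet, and closing the cycle would have required mutating node 1 afterwards.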

2 comments:

Faré said...

I'm not convinced yet. Case in point: OIDs. Two different $1000 gizmo-buying transactions by the same person at the same time should get different ID numbers, so they can be distinguished in further transactions (e.g. return&exchange).

Therefore, OIDs must be part of the persisted slots, even though you don't want them to be specified by the user in the API. Standard answer would then be that OIDs should be generated by some wrapper before the make-instance method is called. At which point, make-instance becomes a low-level function that the user must not, ever, directly call, which abstraction the language does not allow to enforce, making it a low-level language.

That's more a general condemnation of Lisp than of your approach: it doesn't allow to disallow.

Joe Marshall said...

You won't be convinced until you try a few examples. Case in point: OIDs. OIDs would be persistent slots, but they would have an :initform that allocates and initializes a new persistent OID. (Objects that are placed in persistent slots should themselves be persistent, naturally.)
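A minimal sketch of what such an :initform might look like, using a plain in-memory counter as a stand-in for a persistent OID allocator. The purchase class, allocate-oid, and *next-oid* are hypothetical names for illustration; how a persistence layer would restore the stored OID on re-instantiation is not shown.

```lisp
;; Assumption: an in-memory counter stands in for durable OID allocation.
(defparameter *next-oid* 0)

(defun allocate-oid ()
  "Return a fresh OID on every call."
  (incf *next-oid*))

;; The OID slot has an :initform but no :initarg, so client code
;; calls MAKE-INSTANCE directly and never supplies the OID itself.
(defclass purchase ()
  ((oid    :initform (allocate-oid) :reader purchase-oid)
   (buyer  :initarg :buyer  :reader purchase-buyer)
   (amount :initarg :amount :reader purchase-amount)))

;; Two otherwise identical purchases still get distinct OIDs.
(defparameter *purchase-1* (make-instance 'purchase :buyer "alice" :amount 1000))
(defparameter *purchase-2* (make-instance 'purchase :buyer "alice" :amount 1000))
```

Because the :initform is evaluated on each call to make-instance, no wrapper function is needed in front of the constructor.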