Wednesday, December 15, 2010

More persistence

In my last post, I described how to map OIDs to persistent objects by means of a functional, weight-balanced, binary tree. In this post, I will describe how to implement aggregate objects. I'm going to start with a high-level view and drill down to the details. I'm afraid at this point I'm going to switch to Common Lisp and CLOS. Don't worry, though, if you just pretend that defclass is a synonym for define-structure, you won't be far off.

Here's a simple CLOS class definition:
(defclass test-class ()
  ((name :initarg :name
         :initform 'zippy)))
Objects of class test-class have a single field which is called name. This class definition adds a method to make-instance that works as follows:
  ;; make an instance where the name is the symbol GRIFFY 
  (defvar test-instance-1 (make-instance 'test-class :name 'griffy))

  ;; alternatively, use the default name of ZIPPY
  (defvar test-instance-2 (make-instance 'test-class))
In addition, we can use slot-value to extract the name:
  (slot-value test-instance-1 'name)
  => GRIFFY

  (slot-value test-instance-2 'name)
  => ZIPPY
And that's that. (And you thought CLOS was complicated.)

A persistent class is defined as follows:
(defclass test-class ()
  ((name :initarg :name
         :initform 'zippy))
  (:metaclass persistent-standard-class)
  (:schema-version 0))
The :metaclass specifier tells CLOS that we want instances to be persistent. The :schema-version is a bit of magic that protects against the unlikely, but possible nasty case that over time a class definition might evolve where a slot name is accidentally re-used. In such a case, the code will need to disambiguate which class definition was intended and it can use the :schema-version. You can safely ignore this by following this simple rule: If you ever modify the slots in a persistent class, you should increment the :schema-version.

So what is different? On the surface, practically nothing changes. The calls to make-instance and slot-value work just as before. There are four changes that happen ‘under the covers’
  1. The object returned by make-instance will be a persistent object.
  2. Two additional slots are allocated. The first contains the OID of the instance itself, the second contains the persistent store where the instance was allocated.
  3. The name slot in the object will contain the OID of the persistent name rather than a Lisp object. make-instance is modified to extract the OIDs of its arguments to initialize this slot.
  4. slot-value must be modified to dereference the OID that is actually in the slot.
These modifications are easily made using the CLOS MOP.

We use the meta-object protocol to override slot-value-using-class as follows:
(defmethod clos:slot-value-using-class ((class persistent-standard-class)
                                        (object persistent-standard-object)
                                        slot)
   (persistent-object/find 
     (persistent-standard-object/pstore object)
     (call-next-method)))
The call to call-next-method invokes the underlying mechanism for reading a slot value. But we're not storing the actual slot value in the slot, we are storing the OID of the actual slot value, so call-next-method will return that OID. We look up the returned OID in the persistent store to get the value. (In order to support multiple persistent stores, we need to know which store contains the value. We require that all subfields of an object be allocated in the same persistent store as the containing object.)

The modifications to make-instance are more complex. We actually don't modify make-instance directly, but instead we modify clos:shared-initialize (which is invoked by make-instance as part of the instance allocation protocol).
;; Override the primary shared-instance method
;; in order to deal with persistent slots.
(defmethod clos:shared-initialize ((instance persistent-standard-object) slot-names
                                   &rest initargs
                                   &key persistent-store oid
                                   &allow-other-keys)
    ;; We have to wrap the initargs and initforms
    ;; in persistent-objects and create an initializer
    ;; for this object.
    (let* ((class (class-of instance))

           (init-plist (compute-persistent-slot-initargs class
                                                         (or persistent-store  *default-persistent-store*)
                                                         initargs))

           (oid (persistent-object/save
                     (make-initializer class
                                       (class-schema-version class)
                                       init-plist)
                     (or persistent-store  *default-persistent-store*)
                     oid)))

      (apply #'call-next-method instance slot-names (nconc init-plist initargs))
      (setf (persistent-standard-object/oid instance) oid)
      instance))

(defun compute-persistent-slot-initargs (class persistent-store initargs)
  "Scan over the persistent effective slots in CLASS,
   determine the value to be assigned to each slot, either
   from the initargs or from the initfunction, then
   using the persistent-initarg as a key, construct a
   plist for use in the persistent initializer and in
   the inner call to shared-initialize."
  (let ((result nil))
    (iterate (((slot-initargs slot-initfunction slot-persistent-initarg)
               (map-fn '(values t t symbol)
                       #'effective-slot-initialization-info
                       (scan-class-persistent-effective-slots class))))
      (let ((initial-value (slot-initial-value initargs slot-initargs slot-initfunction
                                               (clos::slot-unbound-value))))
        (unless (eq initial-value (clos::slot-unbound-value))
          (push slot-persistent-initarg result)
          (push (slot-value->persistent-node persistent-store initial-value) result))))
    (nreverse result)))
In compute-persistent-slot-initargs, you can see the call to slot-value->persistent-node that extracts the OID from the initarg. In shared-initialize, the call to persistent-object/save writes an initialization record to the store and then allocates and initializes the transient, in-memory version of the object.

These are the essential changes to the MOP to implement the basic persistence mechanism.

More details to come in the next post.

8 comments:

Xach said...

Thanks for these writes-up!

Is this code from ChangeSafe?

Faré said...

Is persistent-object/find a macro? Or is it a function to be called *before* the call-next-method, that you misparenthesized?

Why not use a Scheme object system?

Maybe you should test your code as you post it.

Joe Marshall said...

Xach asked: Is this code from ChangeSafe?

This is an amalgam of ideas from several projects.

Faré asks: Is persistent-object/find a macro? Or is it a function to be called *before* the call-next-method, that you misparenthesized?

Neither. It is correct as is.

Why not use a Scheme object system?

That would add a lot of extra non-portable code. I'll consider it, though.

Maybe you should test your code as you post it.

Indeed. Thank you for finding even more work for me to do.

Anonymous said...

This looks like pretty similar to Rucksack:
http://common-lisp.net/project/rucksack/

Unknown said...

In you CLOS example:

(slot-value test-instance-1 'name)
=> ZIPPY

The first argument to `slot-value' should read `test-instance-2'.

Faré said...

You mean, you do pointer-swizzling at slot-access time rather than instance-initialization time? Meh. I suppose it makes sense if you have potentially large objects and almost always bind swizzled pointers to lexical variables just so as to avoid concurrent interference in accessing the slot twice.

Joe Marshall said...

ECL said: The first argument to `slot-value' should read `test-instance-2'.

Fixed.

Joe Marshall said...

Faré asked: You mean, you do pointer-swizzling at slot-access time rather than instance-initialization time?

Yes. We're going to manage several different views of the database by juggling the object-map. The next few posts should make this clearer.