Tuesday, December 14, 2010

Oh yeah, about that persistent store....

In April of last year I was describing a simple persistent object store. I never finished. Mea culpa. I was thinking about it again and I realize why. The next step in the implementation is a really big one. We're going to solve five unrelated problems in one fell swoop:
  1. Transactions — we will add the ability to roll-back partially completed operations.
  2. Aggregate objects — up to now, we only allow primitive objects to be persisted. We will add the mechanism to allow aggregate structures to be stored and retrieved.
  3. Side effects — the backing store is opened in append mode, so mutation of existing objects is not possible. Since we only allow primitive immutable objects to be stored, this is not an issue. Once we have aggregate objects, however, we will want to simulate the ability to side-effect these objects. The transaction system will have to know how to undo these effects if the transaction aborts.
  4. Schema migration — Persistent objects live a long time. Over time, it is usually the case that we will want to change what fields are contained in an aggregate object. Removing a field is easy, but adding or renaming an existing field can be tricky because legacy objects will be missing the field. Furthermore, if the aggregate object model is rich (for example, if it supports inheritance), some changes to the model might require making extensive changes to the existing persistent objects.
  5. Internal and external references — up to now, an object cannot refer to another object. There is also no way to refer to a particular persistent object other than by directly using it. For example, although we could store two different strings in the persistent store, and we can recover both, we cannot identify which one is ‘first’ and which one is ‘second’.
I haven't figured a way to truly separate these five different features, so this post is going to be a long and complex one.

We need to introduce object identifiers (OIDs). An OID is an integer that refers to a unique persistent object. Every persistent object in a particular persistent store is given a different OID (we do not require that OIDs are universally unique).

At this point, let me illustrate the code.
(define-structure (object-map-entry
                   (type list)
                   (conc-name object-map-entry/)
                   (constructor %make-object-map-entry
                                (object-id address cached-value)))
  (object-id #f read-only #t)
  (address #f read-only #t)

  ;; *** A slot holding the cached value greatly improves
  ;;     performance!
  (cached-value #f read-only #t)
  )

Each object-map-entry associates an object-id with an address. The address is the location within the backing store where the object has been serialized. The cached-value not only speeds up object retrieval, it ensures that every dereference of an OID yields the same EQ? result. Extrinsic identity — the existence of a universal ‘identically equal’ predicate — is a key assumption of Lisp, so it is important to avoid violating this assumption.

We will be maintaining a data structure that maps OIDs to objects. This data structure will itself need to be persistent. Over time, we expect the ratio of new objects to old objects to decrease, so a data structure that supports incremental update will be much more efficient than one that requires wholesale replacement. A weight-balanced binary tree is one way of implementing this map.

A tree has two kinds of nodes: leaf nodes, which hold a single object-map-entry, and branch nodes, which hold sub-trees in addition to an object-map-entry. Object lookup requires descending into the tree to find the object-map-entry with the correct OID.
(define (persistent-tree/descend object-id current-tree best-tree)
  (cond ((null? current-tree) best-tree)
        ((< object-id 
               (object-map-entry/object-id (persistent-tree/object-map-entry current-tree)))
         (persistent-tree/descend object-id 
                                  (persistent-tree/left-child current-tree) best-tree))
        (else
         (persistent-tree/descend object-id
                                  (persistent-tree/right-child current-tree) current-tree))))

(define (persistent-tree/find-entry root object-id)
  (let ((best-tree (persistent-tree/descend object-id root '())))
    (if (null? best-tree)
        #f
        (let ((entry (persistent-tree/object-map-entry best-tree)))
          (if (< (object-map-entry/object-id entry) object-id)
              #f
              entry)))))
Adding an entry to the OID tree is much more complex. Binary trees work best when they are balanced, so when we add a node to the tree we may want to rebalance the tree. This may require creation of new internal nodes of the tree. In addition, we want our tree to be persistent, so any new nodes must be written to the backing store. However, we want the tree to be efficient — adding a node and rebalancing the tree should not require too much computation and disk traffic.
This particular implementation uses a `functional binary tree'. Updates to the tree do not modify existing data structures. Instead, new nodes are allocated and as much of the existing tree as possible is re-used. (See Stephen Adams `Implementing Sets Efficiently in a Functional Language' http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.1134 for the details on the algorithms.) Here is the code:
(define-structure (persistent-tree
                   (conc-name persistent-tree/)
                   (constructor make-persistent-tree (left-child-address
                                                      right-child-address
                                                      %weight
                                                      %address
                                                      left-child
                                                      right-child
                                                      object-map-entry)))

  ;; Address of left child of the tree.
  ;; This field is persistent.
  (left-child-address #f read-only #t)

  ;; Address of right child of the tree.
  ;; This field is persistent.
  (right-child-address #f read-only #t)

  ;; Weight of this tree.
  ;; This field is persistent.
  (%weight #f read-only #t)

  ;; Where the persistent version is in the durable store.
  ;; This field is transient and reconstructed upon deserialization.
  (%address #f read-only #t)

  ;; Cached left child.
  ;; A transient copy of the deserialized left child.
  (left-child #f read-only #t)

  ;; Cached right child.
  ;; A transient copy of the deserialized right child.
  (right-child #f read-only #t)

  ;; The object map entry stored at the root of this tree.
  (object-map-entry #f read-only #t))

(define (persistent-tree/make-leaf durable-store entry)
  (let* ((weight 1)
         ;; Serialize the persistent information from the
         ;; object-map-entry.  (the object-id and the address).
         (address (primitive-serialize-leaf durable-store (object-map-entry/address entry))))
    (make-persistent-tree 0 0 weight address '() '() entry)))

(define (primitive-serialize-leaf durable-store entry-address)
  (write-leaf-record! 
   durable-store
   (lambda (oport)
     ;; store the delta because it is likely to be a small number.
     (write entry-address oport))))

(define (persistent-tree/make-branch durable-store left-child right-child entry)
  (if (and (null? left-child)
           (null? right-child))
      (persistent-tree/make-leaf durable-store entry)
      (let* ((weight (+ 1
                        (persistent-tree/weight left-child)
                        (persistent-tree/weight right-child)))
             (left-child-address (persistent-tree/address left-child))
             (right-child-address (persistent-tree/address right-child))
             ;; Serialize the addresses of the
             ;; left and right child, and the persistent information 
             ;; from the object-map-entry (the object-id and the address).
             (address 
              (primitive-serialize-branch durable-store
                                          left-child-address
                                          right-child-address
                                          (object-map-entry/address entry))))
        (make-persistent-tree left-child-address
                              right-child-address
                              weight
                              address
                              left-child
                              right-child
                              entry))))

(define (primitive-serialize-branch durable-store
                                    left-child-address
                                    right-child-address
                                    entry-address)
  (write-branch-record!
   durable-store
   (lambda (output-port)
     (write left-child-address output-port)
     (write-char #\space output-port)
     (write right-child-address output-port)
     (write-char #\space output-port)
     (write entry-address output-port))))

(define (persistent-tree/add durable-store root new-entry)
  (if (null? root)
      (persistent-tree/make-leaf durable-store new-entry)
      (let ((root-entry (persistent-tree/object-map-entry root))
            (left-child (persistent-tree/left-child root))
            (right-child (persistent-tree/right-child root)))
        (cond ((< (object-map-entry/object-id new-entry)
                  (object-map-entry/object-id root-entry))
               (persistent-tree/t-join 
                durable-store 
                (persistent-tree/add durable-store left-child new-entry)
                right-child
                root-entry))
              ((< (object-map-entry/object-id root-entry)
                  (object-map-entry/object-id new-entry))
               (persistent-tree/t-join
                durable-store 
                left-child
                (persistent-tree/add durable-store right-child new-entry)
                root-entry))
              (else
               (persistent-tree/make-branch durable-store 
                                            left-child right-child
                                            new-entry))))))

(define (persistent-tree/t-join durable-store left-child right-child entry)
  (let ((l.n (persistent-tree/weight left-child))
        (r.n (persistent-tree/weight right-child)))
    (cond ((< (+ l.n r.n) 2)
           (persistent-tree/make-branch durable-store left-child right-child entry))

          ((> r.n (* 5 l.n))
           (persistent-tree/l-join durable-store left-child right-child entry))

          ((> l.n (* 5 r.n))
           (persistent-tree/r-join durable-store left-child right-child entry))

          (else
           (persistent-tree/make-branch durable-store left-child right-child entry)))))

(define (persistent-tree/l-join durable-store left-child right-child entry)
  (if (< (persistent-tree/weight (persistent-tree/left-child right-child))
         (persistent-tree/weight (persistent-tree/right-child right-child)))
      (persistent-tree/single-l durable-store left-child right-child entry)
      (persistent-tree/double-l durable-store left-child right-child entry)))

(define (persistent-tree/single-l durable-store x r entry)
  (persistent-tree/make-branch 
   durable-store 
   (persistent-tree/make-branch durable-store 
                                x (persistent-tree/left-child r) entry)
   (persistent-tree/right-child r)
   (persistent-tree/object-map-entry r)))

(define (persistent-tree/double-l durable-store x r entry)
  (let ((r.l (persistent-tree/left-child r)))
    (persistent-tree/make-branch 
     durable-store
     (persistent-tree/make-branch durable-store
                                  x
                                  (persistent-tree/left-child  r.l)
                                  entry)
     (persistent-tree/make-branch durable-store
                                  (persistent-tree/right-child r.l)
                                  (persistent-tree/right-child r)
                                  (persistent-tree/object-map-entry r))
     (persistent-tree/object-map-entry r.l))))

(define (persistent-tree/r-join durable-store left-child right-child entry)
  (if (< (persistent-tree/weight (persistent-tree/right-child left-child))
         (persistent-tree/weight (persistent-tree/left-child left-child)))
      (persistent-tree/single-r durable-store left-child right-child entry)
      (persistent-tree/double-r durable-store left-child right-child entry)))

(define (persistent-tree/single-r durable-store l z entry)
  (persistent-tree/make-branch
   durable-store
   (persistent-tree/left-child l)
   (persistent-tree/make-branch durable-store
                                (persistent-tree/right-child l)
                                z
                                entry)
   (persistent-tree/object-map-entry l)))

(define (persistent-tree/double-r durable-store l z entry)
  (let ((l.r (persistent-tree/right-child  l)))
    (persistent-tree/make-branch 
     durable-store
     (persistent-tree/make-branch durable-store
                                  (persistent-tree/left-child  l)
                                  (persistent-tree/left-child  l.r)
                                  (persistent-tree/object-map-entry l))
     (persistent-tree/make-branch durable-store
                                  (persistent-tree/right-child l.r)
                                  z
                                  entry)
     (persistent-tree/object-map-entry l.r))))


In the next post, I will describe how we use this structure to implement transactions, simulate side-effects, and turn schema migration into a non-issue.

1 comment:

Faré said...

Separating concerns:

1- Transactions. Only need concern the root of the tree. You don't need to introduce transactions on anything complex.

2- Once you can persist large enough blobs, all complex structures are a matter of encoding - and integrity constraint enforcement.

3- There again, state is a matter of encoding on top of structures.

4- Schema evolution is a matter of maintaining a reflective representation of the type encodings and integrity constraints

5- I would probably insert that between 2 and 3, there again, a matter of encoding.