Friday, December 17, 2010

Transactions, part I

Faré has persuaded me to post actual working and tested code. It seems that enough has changed since I switched computers and the code that was working is working no longer. (The problem has nothing to do with my code — a shared library referred to by Scheme seems to have moved and Scheme came crashing down.) I'll get things working and show the live code. Until then, we'll have to make do with illustrative code fragments.

Given the index structure I described over the past two days, you can easily see that we have solved the problem of internal and external references. We use OIDs as a sort of ‘pointer’, and we modify the access mechanisms to hide the redirection. An analagous modification to make-instance gives us aggregate objects by initializing the instance slots with the appropriate OIDs.

The next thing to deal with is transactions. These are very easy to implement using this design. We have provided no mechanism to rewrite any data in the backing store; it is opened in append-only mode. When we augment the object-map to associate new OIDs with new objects, we construct new object-map nodes and append them to the backing store, but we do not modify any existing object-map nodes. If we were to ignore the new object-map and re-use the old one, we would not be able to access the new objects. The entire object graph would appear to us as if time had stopped at the point when the old object-map was created. This is the key to implementing transactions.

When we begin a transaction, we create a transaction object and copy the most recent object-map root node to it. As we create objects we update the transaction object with augmented object-map. If the transaction fails, we simply discard it. The next transaction is initialized with either the new object-map (if the prior transaction committed) or the old object-map (if the prior transaction aborted). There is no need to ‘undo’ anything because persistent objects cannot be mutated.

When the computer shuts down (either by a crash or planned outage) and is restarted, we use the information in the backing store to restore the state of the program. We need to find the most recent, but valid, root of the object-map. Since we cannot plan crashes, we might find objects and object-map nodes at the tail of the backing store that were written by a transaction that was in progress when the crash happened. We want to ignore these because they represent intermediate state. The trick is this: when a transaction commits we append a special ‘commit record’ to the backing store. The commit record will contain the location (within the store) where the root node of the object-map is stored. When the process restarts, it scans the backing store to find the latest valid commit record and it initializes from there. Any objects that were allocated after that commit record are simply ignored.

If transactions usually commit and the computer rarely crashes, then the most recent commit record will usually be the very last record in the backing store. If the computer crashes a lot, or if transactions are usually aborted, then it is likely that the backing store would accumulate abandoned objects. One could develop a way of reclaiming that storage.

Here is an actual commit record:
 (:record-format 0.4)
 (:timestamp 3331389370)
 (:object-map-offset 5971200)
 (:previous-record-offset 5970432)
 (:reason "Update user context.")
The commit record has start and end markers so we can identify incomplete commit records. The object-map is at location 5971200 in the backing store, and the previous commit record is at location 5970432. (For extra safety, the commit record should contain a CRC or hash of the object-map. It should probably contain a CRC or hash of itself in order to make it self-validating. Or you can punt.)

The :reason is just a string. It is not used for anything, but it is extremely important nonetheless. The :reason gives us an audit trail in the event of a catastrophic failure. This is part of the “belt and suspenders” safety mechanism. The backing store may kept for a long time, perhaps even many years. Although we will protect the data with multiple mechanisms to reduce the probability of simultaneous failures, our last line of defense is to manually inspect the backing store records. The :reason field will make this much easier.

Ok, now for some code to illustrate. The actual code has a number of bells and whistles, and I'll show them later, but it is best to start with a stripped down version.
(defconstant *transaction-dispositions*
  '(:running :committing :aborting :committed :aborted))

;; running = transaction should commit upon return,
;;           but abort if a throw occurs
;; aborting = transaction is still running, but it must abort upon exit
;; committing = transaction is still running, but it must commit
;;              upon exit, even if throwing
;; aborted = transaction is over, it aborted
;; committed = transaction is over, it committed
(deftype transaction-disposition () `(member ,@*transaction-dispositions*))

(eval-when (:load-toplevel :compile-toplevel :execute)
  (defvar *current-transaction*)
  (setf (documentation '*current-transaction* 'variable)
        "The current transaction.  Not bound if we are not in a transaction."))

;;; Transactions
(defclass transaction ()
  ((disposition :accessor transaction/disposition
                :initform :running
                :type transaction-disposition)
   (reason      :accessor transaction/reason
                :initarg :reason
                :initform "Unknown"
                :type string)
   (object-map :accessor transaction/object-map
               :initarg  :object-map)))
The code for running a transaction is conceptually simple, but the bells and whistles obscure the simplicity. I'll present a stripped-down version first and then show the full version in a later post.
(defun call-with-transaction (pstore reason receiver)
  (let ((transaction   nil)
        (normal-return nil)
        (abort-reason  "Non-local exit from call-with-transaction."))
            ;; This is the only place that transactions are created.
            (setq transaction
                   :object-map (persistent-store/object-map pstore)
                   :reason reason))
                (let ((*current-transaction* transaction))
                   (funcall receiver transaction)))
              (setq normal-return t)))

        ;; If the transaction is null, we never even got started.
        (unless (null transaction)
          (ecase (transaction/disposition transaction)
             ;; If we return normally and no one decided to commit
             ;; or abort before now, we commit.
            (:running (if normal-return
                          (transaction/do-commit transaction)
                            (setf (transaction/reason transaction) abort-reason)
                            (transaction/do-abort  transaction))))
            (:committing (transaction/do-commit transaction))
            (:aborting   (transaction/do-abort  transaction))))))

;; Aborting is simple.
(defun transaction/do-abort (transaction)
  "Perform the necessary work to abort the transaction."
  (assert (or (eq (transaction/disposition transaction) :aborting)
              (eq (transaction/disposition transaction) :running)))
  (setf (transaction/disposition transaction) :aborted))

;; Committing is simple, too.  Just write a commit record.
(defun transaction/do-commit (transaction)
  "Perform the necessary work to commit the transaction."
  (assert (or (eq (transaction/disposition transaction) :committing)
              (eq (transaction/disposition transaction) :running)))
   (let* ((object-map (transaction/object-map transaction))
          (pstore (object-map/pstore object-map)))
     (persistent-store/commit pstore
         :object-map object-map
         :reason (transaction/reason transaction))
     (setf (transaction/disposition transaction) :committed)))
More posts to follow...

No comments: