Thursday, January 13, 2011

Some Bells and Whistles

There are a few bells and whistles that I added to my persistent store. The first is multiple extents (that is, more than one persistent store open and participating in a transaction). This is fairly easy to add. The only tricky part is ensuring that a transaction that involves multiple extents either fully commits or aborts across all participating stores. The way I chose to implement this is by choosing a single particular persistent store to be the transaction master. The commit record in the transaction master is the one that holds the authoritative record of whether the entire transaction has committed.

The commit proceeds in two phases. In the first phase, each non-master store performs a ‘conditional commit’. This involves flushing the pending writes to the backing store and writing a special commit record that refers to the transaction master. Once each non-master store has written the special commit record, the commit record is written to the master store.

If nothing unusual happens, processing may continue as in the normal case. The interesting thing happens when the non-master store is first opened. If the most recent commit record in a store is a ‘conditional commit’, then it is necessary to check the transaction master to see if it committed. Typically, the relationship between the different stores is that there is one master and one or more ‘satellite’ stores. Therefore, it is likely that the transaction master is already open and within a transaction when the satellite store is opened, so the master commit record will be easily at hand. If not, then the code recursively opens the master store to determine if the conditional commit of the satellite is valid. This adds a little amount of bookkeeping to the commit code, but it isn't particularly complex.

Nested Transactions and Multi-version Concurrency Control (MVCC)

My original plan had been to avoid the complexities of nested transactions and MVCC. I designed this as a simple store, and I figured that a good number of applications would not need nested transactions. I discovered almost immediately that I was wrong. If nested transactions are disallowed and a procedure needs to start a transaction, it has to determine whether an existing transaction is in progress. If it is not, then the procedure needs to begin the transaction and ensure it is committed when it exits. On the other hand, if there is an existing transaction, the procedure should ‘join’ that transaction and let whoever started the transaction decide whether or not to commit. This isn't so bad if you can segregate your procedures into ‘top-level’ (ones that always start and commit a transaction) and ‘subroutines’ (ones that never start and commit a transaction, but expect there to be one in place before being called). Unfortunately, recursive procedures cannot be in both categories simultaneously. A possible workaround is to have duplicate code tailored for each use case, but then the programmer would have to remember which version to call under which circumstances, or we'd have to avoid using recursion. Since this is clearly unacceptable, I bit the bullet and implemented nested transactions.

With the current design, nested transactions turn out to be pretty simple. When a non-nested transactions starts, it gets a copy of the most recent object-map which it uses for OID resolution. When a nested transaction starts, it gets a copy of the current object-map from the immediately enclosing transaction. The nested transaction gets to see the intermediate state of the enclosing transaction. If the nested transaction commits, it does not write a commit record to the store. Instead, the object-map from the nested transaction is used as the object-map of the enclosing transaction when control is returned. Thus the enclosing transaction gets to see the state that the nested transaction committed. When the enclosing transaction commits, the object-map (which includes the nested transaction updates) is written to the store and committed.

If the enclosing transaction aborts, the nested transaction state is simply dropped on the floor and becomes inaccessible. The interesting question is how to handle the case where the nested transaction aborts. In this case, we return control to the enclosing transaction, but we discard the object-map from the nested transaction. The enclosing transaction therefore sees no effect at all from the nested transaction. The enclosing transaction has the opportunity to itself abort (in which case all the intermediate state is discarded), to commit (in which case only the effect of the enclosing transaction is made permanent), or to perhaps continue processing and begin more nested transactions.

In effect, nested transactions act like software transactional memory.

At this point, one can see how MVCC can be easily implemented. Each concurrent thread runs its own transactions with its own object-map. Since all OID resolution is performed against the object-map associated with the current (thread local) transaction, each concurrent transaction sees only its view of the persistent store. (This is a necessary condition, but it is not sufficient. We have to ensure that committed transactions can be given a serial order to preserve the isolation between transactions. We do this by recording in each transaction the OIDs of every persistent object read and written within the transaction. When the transaction attempts to commit, the set of reads and writes is compared against the set of objects read and written by any other parallel transaction that has committed since this one started. If there is an overlap, we abort. That is, we use optimistic locking.)

(This should answer Fare's question about why OID resolution is performed on each read rather than being done just once when the store is opened.)

These features involve adding a good chunk of bookkeeping code, but the code is straightforward, albeit tedious. Good testing code is vital to ensuring that this all works as it should.

No comments: