Friday, July 26, 2013

Fun with Audit Trails!

Gotta change direction here.  It's getting boring.  Let's implement an audit trail!  How hard could it be? We'll make an audit trail out of audit records.  Pick your favorite language and try this.

Exercise 1:  Implement an audit record.  Make sure it has these features:
  • a timestamp of when the record was created
  • a user-id to tell us who created the record
  • a reason why in the form of human readable text
  • immutable, including transitively reachable subobjects
  • Malformed records cannot be created.  Automatically set the timestamp so it cannot be forged.
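
As one possible starting point for Exercise 1, here is a minimal sketch in Python. The post leaves the language open, so the language and every name here (`AuditRecord`, `user_id`, `reason`, `timestamp`) are assumptions for illustration. The idea is a frozen dataclass that validates its inputs and sets its own timestamp.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical audit record; the names are illustrative only."""
    user_id: str
    reason: str
    # init=False means callers cannot pass (and so cannot forge) a timestamp.
    timestamp: datetime = field(init=False)

    def __post_init__(self):
        # Refuse to create malformed records.
        if not isinstance(self.user_id, str) or not self.user_id:
            raise ValueError("user_id must be a non-empty string")
        if not isinstance(self.reason, str) or not self.reason.strip():
            raise ValueError("reason must be non-empty text")
        # A frozen dataclass blocks ordinary assignment, so the timestamp
        # is set exactly once here, during construction.
        object.__setattr__(self, "timestamp", datetime.now(timezone.utc))


record = AuditRecord("alice", "initial import")
```

In Python, `str` and `datetime` objects are themselves immutable, so the transitively reachable parts of this record cannot be altered either. A determined programmer can still bypass the freeze with `object.__setattr__`, so this is protection against accidents rather than against forgery.
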
Now we have to keep them somewhere safe, and get them back on occasion.

Exercise 2a: Make a repository with some sort of stable storage for a backing store, like a file.

Exercise 2b:  When creating an audit record, log it to stable storage. Make the repository be a required argument for creation of an audit record.  Log the record in the repository, but don't store info about the repository itself in the record.  The record doesn't have to know where it lives.

Exercise 2c:  Have a way to load all the audit records back in from stable storage.  Intern the audit records so re-loading is idempotent and eq-ness is preserved.
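
Exercises 2a through 2c might be sketched roughly as follows, again in Python and again with hypothetical names (`Repository`, `make_audit_record`, `load_all`). The assumptions: the backing store is a text file with one JSON array per line, and records are interned by their position in that file. The record type is redefined here in a simpler form so the block stands alone; direct construction with a timestamp is meant only for reloading.

```python
import json
import os
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    timestamp: str   # ISO 8601 text in UTC
    user_id: str
    reason: str      # note: no reference back to the repository


class Repository:
    """Append-only log of audit records backed by a file."""

    def __init__(self, path):
        self.path = path
        self._records = []                # intern table: file position -> object
        open(path, "a").close()           # make sure the backing file exists
        self.load_all()

    def append(self, record):
        line = json.dumps([record.timestamp, record.user_id, record.reason])
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(line + "\n")
            f.flush()
            os.fsync(f.fileno())          # push the bytes to stable storage
        self._records.append(record)

    def load_all(self):
        # Only build objects for lines not already interned, so calling
        # this repeatedly returns the very same objects each time.
        with open(self.path, encoding="utf-8") as f:
            for index, line in enumerate(f):
                if index >= len(self._records):
                    self._records.append(AuditRecord(*json.loads(line)))
        return tuple(self._records)


def make_audit_record(repository, user_id, reason):
    """The repository is a required argument; the record is logged on creation."""
    if not user_id or not reason:
        raise ValueError("user_id and reason are required")
    record = AuditRecord(datetime.now(timezone.utc).isoformat(), user_id, reason)
    repository.append(record)
    return record
```

Interning by file position rather than by content means two records with identical fields remain distinct objects, which seems like the right behavior for a log.
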

Exercise 3:  Implement random access to a repository's audit log (indexed by integer)
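
For Exercise 3, one approach (an assumption, not the only way) is to remember the byte offset at which each record starts, so that record N can be fetched with a single seek instead of a scan. The sketch below is in Python; `IndexedLog` is a hypothetical name, and the file format assumed is one JSON array per line.

```python
import json
import os


class IndexedLog:
    """Random access by integer index into a line-oriented log file."""

    def __init__(self, path):
        self.path = path
        self._offsets = []            # _offsets[i] = byte where record i starts
        open(path, "ab").close()      # make sure the file exists
        with open(path, "rb") as f:   # rebuild the index from existing contents
            position = 0
            for line in f:
                self._offsets.append(position)
                position += len(line)

    def append(self, fields):
        data = (json.dumps(fields) + "\n").encode("utf-8")
        with open(self.path, "ab") as f:
            start = f.seek(0, os.SEEK_END)   # current end of file
            f.write(data)
        self._offsets.append(start)

    def __len__(self):
        return len(self._offsets)

    def __getitem__(self, index):
        if not 0 <= index < len(self._offsets):
            raise IndexError(index)
        with open(self.path, "rb") as f:
            f.seek(self._offsets[index])
            return json.loads(f.readline())
```

This returns the raw fields of a record. Wrapping the result in an interned record object, keyed by the same integer index, would be the natural next step.
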

These are all very easy.  Don't worry, I'll make it harder.

6 comments:

João Távora - 6311 said...

What exactly do you mean by "immutable, including transitively reachable subobjects"?

Unknown said...

I could be wrong, of course, but "transitively reachable subobjects" to me means stuff like properties of the timestamp, or properties of the (no such thing as plain text) reason string.

The idea is that not only are you forbidden from setting (altering, changing) the timestamp property of an audit record, but you are also forbidden from setting any properties on that timestamp such as Seconds or UTC offset. (Assuming, of course, that such properties are exposed at all.)

(To an auditor, there's not much difference between replacing the timestamp in its entirety and replacing it by parts. Either way the data has been altered, and would constitute a forgery.)

João Távora - 6311 said...

@Bryan Bates

Fair enough (though I still have doubts on the "transitively"). How would one enforce such a thing in Common Lisp? Make the :reader return a copy of the subobject?

Unknown said...

The transitive relationship is something like:

* Audit Records have a Timestamp. (A->B)
* Timestamps have a UTC offset. (B->C)
* Therefore, Audit Records transitively have a UTC offset. (A->C)

Unfortunately, I'm not experienced with Lisp (yet!), so I'm not sure exactly how to go about enforcing the restriction. Returning a copy sounds like the right idea, though.

Joe Marshall said...

I'm not exactly sure what you mean by enforce. If we're serious about calling these "audit records", then we can physically enforce immutability by writing them to durable write-once storage.

The software ought to faithfully model this. But it should also be convenient to use. So we compromise.

We use a Lisp string to represent the reason string in an audit record. The reason string in the record is immutable, but the representation as a Lisp object is not. If someone mutates the reason string, the Lisp copy won't match what is in durable storage.

We don't enforce immutability on the Lisp copy. We expect the programmer to leave it alone.

João Távora - 6311 said...

Yeah, you're right.

Anyway, I meant enforcing as in how the C++ compiler enforces the ideas of (logical) constness defined by the API author's use of the keyword const.

Here, I suggested having some kind of slot-value-ish hack to have the subobject readers of my immutable object return copies of the actual slot values. Maybe this could even be done with MOP and some kind of "immutable" metaclass or something.

Obviously it would not totally prevent mutation, but it would make it harder. Just like in C++, where we can still break const-correctness but have to explicitly use a cast.

Actually, the reader-returns-copies idea is probably quite stupid: it would be miserably slow and would break (eq (reason-of auditx) (reason-of auditx)) and probably a million other things.

I could probably think of other techniques, but if I understand you correctly, then I like your logic of the "compromise". It's just that other languages take a more authoritarian stance.