Saturday, March 12, 2022

Obviously

It's kind of obvious that the representations you choose will influence the character of the code that uses them. If you represent things in lists, you will likely end up with code that recursively cdrs down them. If you represent things in an array, you will likely end up with code that iteratively walks through the array.

Sometimes there are several reasonable representation options and you have to make an engineering decision. Other times, there is one representation option that is obviously superior. But often enough there is one representation option that is obvious, but not necessarily superior. You don't want to pick a representation simply because it is obvious.

Let me give a concrete example. The problem is implementing a Tic Tac Toe server. Our junior programmer reasons as follows: the game is played on a three by three grid, which should obviously be a three by three array. This gives us aref and (setf aref) as primitives. We obviously need an add-mark! procedure that writes a mark into the grid at a location (and errors if the location is already marked).

(defun add-mark! (grid row column mark)
  (if (null (aref grid row column))
      (setf (aref grid row column) mark)
      (error "Cell occupied.")))

There is a bug here. If two threads call add-mark! on the same location simultaneously, one should succeed and the other should signal “Cell occupied.” The above code doesn't guarantee this.

So let's back up. Our more experienced programmer reasons like this: since we're writing a server we can expect multiple threads. An approach relying on mutable state will need synchronization mechanisms, but synchronization is a non-issue if we use immutable representations. If we decide on immutable representations, then we'll have operations like add-mark which returns a fresh object rather than mutating the existing one.

Rather than immediately pinning down a representation, our experienced programmer considers the Tic Tac Toe abstraction. What are the fundamental abstract operations? We need to be able to create an fresh game, make a mark, check to see if there are three in a row, check to see if any empty spaces are left, and print a grid. Certainly you can implement all of these if you are given aref and (setf aref), but you might choose to do things differently based on how the abstraction is going to be used. For example, if you needed to be able to rewind and replay a game, a list of moves might be a better representation.

Of course we consider a three by three array as a possible grid representation. But arrays are designed to be modified in place, so we're going to be pulled towards using them as mutable state. We have an engineering choice:

  • Implement non-destructive operations by copying the array when necessary.
  • Represent the grid differently, e.g., a pair of bitmaps.
  • Represent the entire game differently, e.g. as a list of moves
  • Implement a synchronization mechanism, e.g. a mutex
  • Punt, and let clients of the implementation figure out how to deal with the thread safety problem

A three by three array is an obvious choice for a Tic Tac Toe implementation. Depending on other considerations, though, it isn't obviously the best choice.