Saturday, March 12, 2022

Obviously

It's kind of obvious that the representations you choose will influence the character of the code that uses them. If you represent things in lists, you will likely end up with code that recursively cdrs down them. If you represent things in an array, you will likely end up with code that iteratively walks through the array.

Sometimes there are several reasonable representation options and you have to make an engineering decision. Other times, there is one representation option that is obviously superior. But often enough there is one representation option that is obvious, but not necessarily superior. You don't want to pick a representation simply because it is obvious.

Let me give a concrete example. The problem is implementing a Tic Tac Toe server. Our junior programmer reasons as follows: the game is played on a three by three grid, which should obviously be a three by three array. This gives us aref and (setf aref) as primitives. We obviously need an add-mark! procedure that writes a mark into the grid at a location (and errors if the location is already marked).

(defun add-mark! (grid row column mark)
  (if (null (aref grid row column))
      (setf (aref grid row column) mark)
      (error "Cell occupied.")))

There is a bug here. If two threads call add-mark! on the same location simultaneously, one should succeed and the other should signal “Cell occupied.” The above code doesn't guarantee this.

So let's back up. Our more experienced programmer reasons like this: since we're writing a server we can expect multiple threads. An approach relying on mutable state will need synchronization mechanisms, but synchronization is a non-issue if we use immutable representations. If we decide on immutable representations, then we'll have operations like add-mark which returns a fresh object rather than mutating the existing one.

Rather than immediately pinning down a representation, our experienced programmer considers the Tic Tac Toe abstraction. What are the fundamental abstract operations? We need to be able to create an fresh game, make a mark, check to see if there are three in a row, check to see if any empty spaces are left, and print a grid. Certainly you can implement all of these if you are given aref and (setf aref), but you might choose to do things differently based on how the abstraction is going to be used. For example, if you needed to be able to rewind and replay a game, a list of moves might be a better representation.

Of course we consider a three by three array as a possible grid representation. But arrays are designed to be modified in place, so we're going to be pulled towards using them as mutable state. We have an engineering choice:

  • Implement non-destructive operations by copying the array when necessary.
  • Represent the grid differently, e.g., a pair of bitmaps.
  • Represent the entire game differently, e.g. as a list of moves
  • Implement a synchronization mechanism, e.g. a mutex
  • Punt, and let clients of the implementation figure out how to deal with the thread safety problem

A three by three array is an obvious choice for a Tic Tac Toe implementation. Depending on other considerations, though, it isn't obviously the best choice.

Monday, February 14, 2022

Symbols vs. Strings

Most popular computer languages don't have symbols as a data type. You can make do with a string, or encode your symbol as a small integer or enum. It's a hack, but simple enough and common enough that it doesn't need a second thought.

Lisp has symbols, though, so if you are using a string as a stand-in you should give it a second thought. In Lisp, symbols and strings have different roles. Strings are composite objects, symbols are atomic. String operations are concerned with the contents of the string, symbol operations are concerned with the identity of the symbol.

I saw a Common Lisp Tic Tac Toe program that represented the players' marks with the strings "X" and "O". I could see no reason why it couldn't have used the symbols 'X and 'O. Or the keywords :X and :O. Or how about the Unicode symbols '✗ and '◯?

Symbolic processing is the raison d'ĂȘtre of Lisp. It's a little absurd to represent symbols as strings in Lisp.

Saturday, February 5, 2022

Imperative vs. Declarative

I saw this recently:

    token = request.headers.get('Authorization')
    token = token.split(' ')[1]

From an imperative programming point of view, there is nothing wrong with this. We assign to the token variable the contents of the Authorization header, then we assign to the token variable the second substring after splitting along spaces. After executing these two statements, token will contain the desired value.

From a declarative programming point of view, this is terrible. The first statement binds the name token to the Authorization header, but the second statement contradicts the first by changing the binding of token to mean something else. Thinking declaratively, we'd much prefer either

    header = request.headers.get('Authorization')
    token = header.split(' ')[1]
or
    token = request.headers.get('Authorization').split(' ')[1]

The first option avoids the contradition and reassignment by simply using a separate variable. The item we get from request.headers isn't a token, it's a header, so we should name the variable appropriately. The token is a separate quantity that we compute from the header and the code directly reflects that.

The second option just avoids the intermediate variable and lets the compiler choose how to deal with it. Again, there is no contradiction or reassignment, token receives its final value when it is bound.

The problem is this: in the original pair of statements, the first statement

    token = request.headers.get('Authorization')
while imperatively a valid command, is declaratively a lie. It says that token is literally the Authorization header, but it isn't. The second statement patches things up by fixing the value of token to be what it ought. It seems poor practice to deliberately put these sorts of lies in our programs. We can make the first statement true by simply renaming the variable.

Thursday, January 6, 2022

Idle puzzles 2: Revenge of the Shift

The idle puzzles got some web traffic, so here are a couple more in the same vein. Not much new, just a variation on a theme. They can be done in your head, but I spent a few minutes coding up some solutions to see what was involved.

In the previous puzzles, you were given these numeric primitives:

(import 'cl:zerop (find-package "PUZZLE"))
(defun puzzle::shr (n) (floor n 2))
(defun puzzle::shl0 (n) (* n 2))
(defun puzzle::shl1 (n) (1+ (* n 2)))
and the task was to implement basic arithmetic on non-negative integers.

These puzzles extend the task to include negatve numbers. We are given one additional primitive:

(defun puzzle::-1? (n) (= n -1))

If you want challenge yourself, you could speedrun the problems from start to finish. Or try to adapt the solutions to the prior puzzles with the minimal amount of editing and new code (minimize the diff). Another thing you could try is instrumenting shr, shl0, and shl1 to count the amount of shifting taking place and try to minimize that.

Here is the puzzle:

;;; -*- Lisp -*-

(defpackage "PUZZLE"
  (:use)
  (:import-from "COMMON-LISP"
                "AND"
                "COND"
                "DEFUN"
                "IF"
                "FUNCALL"
                "FUNCTION"
                "LAMBDA"
                "LET"
                "MULTIPLE-VALUE-BIND"
                "NIL"
                "NOT"
                "OR"
                "T"
                "VALUES"
                "ZEROP"))

(defun puzzle::shr (n) (floor n 2))
(defun puzzle::shl0 (n) (* n 2))
(defun puzzle::shl1 (n) (1+ (* n 2)))

(defun puzzle::-1? (n) (= n -1)) ;; new primitive

(in-package "PUZZLE")

;;; You can only use the symbols you can access in the PUZZLE package.

;;; Problem -1 (Example).  Fix = to handle negative numbers.

(defun = (l r)
  (cond ((zerop l) (zerop r))
        ((-1? l) (-1? r))     ;; new base case
        ((zerop r) nil)
        ((-1? r) nil)         ;; new base case
        (t (multiple-value-bind (l* l0) (shr l)
             (multiple-value-bind (r* r0) (shr r)
               (if (zerop l0)
                   (and (zerop r0)
                        (= l* r*))
                   (and (not (zerop r0))
                        (= l* r*))))))))

;;; Problem 0.  Implement minusp and plusp.

;;; Problem 1.  Fix > to handle negative numbers.

;;; Problem 2.  Fix inc and dec to handle negative numbers.

;;; Problem 3.  Implement logand, logior, and logxor.

;;; Problem 4.  Implement neg (unary minus).

;;; Problem 5.  Fix add and sub to handle negative numbers.

;;; Problem 6.  Fix mul to handle negative numbers.

;;; Problem 7.  Implement floor and ceiling for both positive and
;;;             negative numbers

My Solutions

The reason we're given a new primitive, -1?, is because the shr function has two fixed points: 0, and -1. So when we write code that recurs over a shr, the recursion is going to bottom out in one of those two base cases and we need to distinguish between them. The earlier puzzles ensured we'd bottom out at zero by specifying non-negative numbers, but if we allow negative numbers, our recursions could bottom out at -1.

;;; Problem 0.  Implement minusp and plusp

(defun minusp (n)
  (cond ((zerop n) nil)  
        ((-1? n) t)
        (t (minusp (shr n)))))

(defun plusp (n)
  (not (or (zerop n)
           (minusp n))))

We have to handle both base cases for both the arguments:

;;; Problem 1.  Fix > to handle negative numbers.

(defun > (l r)
  (cond ((zerop l) (minusp r))
        ((-1? l) (and (minusp r)
                      (not (-1? r))))
        ((or (zerop r)
             (-1? r)) (plusp l))
        (t (multiple-value-bind (l* l0) (shr l)
             (multiple-value-bind (r* r0) (shr r)
               (if (and (not (zerop l0)) (zerop r0))
                   (not (> r* l*))
                   (> l* r*)))))

This is interesting. The base case handles when one or the other argument is 0 or -1, but the recursive case doesn't know if the arguments are positive or negative. It doesn't seem to care, either. What is going on? This is the result of using floor on a negative number. The remainder is still a positive number, so when we operate on l0 and r0 we treat them as positive numbers regardless of whether l or r are positive or negative.

;;; Problem 2.  Fix inc and dec to handle negative numbers.

(defun inc (n)  
  (if (-1? n)
      0
      (multiple-value-bind (n* n0) (shr r)
         (if (zerop n0)
             (shl1 n*)
             (shl0 (inc n*))))))

(defun dec (n)  
  (if (zerop n)
      -1
      (multiple-value-bind (n* n0) (shr r)
         (if (zerop n0)
             (shl1 (dec n*))
             (shl0 n*)))))

Well that was easy, we just had to handle the zero crossing.

No doubt you've noticed that shr shifts a number to the right as if it were held in a register. If you shift the bits out of a negative number, you will notice that the bits come out as if the number were "stored" in two's complement form, with negative numbers being infinitely extended to the left with 1s. This is curious because we didn't design or choose a two's complement representation, it just sort of appears.

;;; Problem 3.  Implement logand, logior, and logxor.

(defun logand (l r)
  (cond ((zerop l) 0)
        ((-1? l) r)
        ((zerop r) 0)
        ((-1? r) l)
        (t (multiple-value-bind (l* l0) (shr l)
             (multiple-value-bind (r* r0) (shr r)
               (if (or (zerop l0) (zerop r0))
                   (shl0 (logand l* r*))
                   (shl1 (logand l* r*))))))))

(defun logior (l r)
  (cond ((zerop l) r)
        ((-1? l) -1)
        ((zerop r) l)
        ((-1? r) -1)
        (t (multiple-value-bind (l* l0) (shr l)
             (multiple-value-bind (r* r0) (shr r)
               (if (and (zerop l0) (zerop r0))
                   (shl0 (logior l* r*))
                   (shl1 (logior l* r*))))))))

(defun complement (n)
  (cond ((zerop n) -1)
        ((-1? n) 0)
        (t (multiple-value-bind (n* n0) (shr n)
             (if (zerop n0)
                 (shl1 (complement n*))
                 (shl0 (complement n*)))))))

(defun logxor (l r)
  (cond ((zerop l) r)
        ((-1? l) (complement r))
        ((zerop r) l)
        ((-1? r) (complement l))
        (t (multiple-value-bind (l* l0) (shr l)
             (multiple-value-bind (r* r0) (shr r)
               (if (or (and (zerop l0) (zerop r0))
                       (and (not (zerop l0)) (not (zerop r0))))
                   (shl0 (logxor l* r*))
                   (shl1 (logxor l* r*))))))))

;;; Problem 4.  Implement neg (unary minus).

(defun neg (n)
  (cond ((zerop n) 0)
        ((-1? n) 1)
        (t (multiple-value-bind (n* n0) (shr n)
             (if (zerop n0)
                 (shl0 (neg n*))
                 (shl1 (complement n*)))))))

This is basically (inc (complement n)), which is how you negate a two's complement number, but inc and the complement step have been folded together to reduce the amount of shifting.

;;; Problem 5.  Fix add and sub to handle negative numbers.

(defun add (l r)
  (cond ((zerop l) r)
        ((-1? l) (dec r))
        ((zerop r) l)
        ((-1? r) (dec l))
        (t (multiple-value-bind (l* l0) (shr l)
             (multiple-value-bind (r* r0) (shr r)
               (if (zerop l0)
                   (if (zerop r0)
                       (shl0 (add l* r*))
                       (shl1 (add l* r*)))
                   (if (zerop r0)
                       (shl1 (add l* r*))
                       (shl0 (addc l* r*)))))))))

(defun addc (l r)
  (cond ((zerop l) (inc r))
        ((-1? l) r)
        ((zerop r) (inc l))
        ((-1? r) l)
        (t (multiple-value-bind (l* l0) (shr l)
             (multiple-value-bind (r* r0) (shr r)
               (if (zerop l0)
                   (if (zerop r0)
                       (shl1 (add l* r*))
                       (shl0 (addc l* r*)))
                   (if (zerop r0)
                       (shl0 (addc l* r*))
                       (shl1 (addc l* r*)))))))))

(defun sub (l r) (add l (neg r)))

The great thing about two's complement is that you can handle negative numbers without changing how you handle the low order bits. For add and addc, I only had to add the two additional base cases for l or r being -1.

By the way, you shouldn't define sub this way. It's double the number of shifts.

;;; Problem 6.  Fix mul to handle negative numbers.

(defun fma (l r a)
  (cond ((zerop r) a)
        ((-1? r) (sub a l))   ;; added this line
        (t (multiple-value-bind (r* r0) (shr r)
             (fma (shl0 l) r* (if (zerop r0) a (add l a)))))))

(defun mul (l r) (fma l r 0))

Two's complement to the rescue again. The loop can treat the low-order bits of r the same way regardless of whether r is positive or negative.

;;; Problem 7.  Implement floor and ceiling for both positive and
;;;             negative numbers

(defun floor0 (n d)
  (if (> d n)
      (values 0 n)
      (multiple-value-bind (q r) (floor0 n (shl0 d))
        (if (> d r)
            (values (shl0 q) r)
            (values (shl1 q) (sub r d))))))

(defun ceil0 (n d)
  (if (not (> n d))
      (values 1 (sub n d))
      (multiple-value-bind (q r) (ceil0 n (shl0 d))
        (let ((r1 (add d r)))
          (if (plusp r1)
              (values (shl0 q) r)
              (values (dec (shl0 q)) r1))))))

(defun floor (n d)
  (if (minusp n)
      (multiple-value-bind (q r) (ceiling (neg n) d)
        (values (neg q) (neg r)))
      (if (minusp d)
          (multiple-value-bind (q r) (ceil0 n (neg d))
            (values (neg q) r))
          (floor0 n d))))

(defun ceiling (n d)
  (if (minusp n)
      (multiple-value-bind (q r) (floor (neg n) d)
        (values (neg q) (neg r)))
      (if (minusp d)
          (multiple-value-bind (q r) (floor0 n (neg d))
            (values (neg q) r))
          (ceil0 n d))))

My original method for division doesn't work too well with negative numbers. I worked around that by converting the problem to positive numbers and converting the answer back to negative numbers where appropriate. Supporting negative numbers for division is an exercise in combinatorics. All this checking for minusp and calls to neg cause a lot of shifting of numbers. There is no doubt a better way, but my brain hurts now.