Wednesday, July 31, 2024

Continuation passing style resource management

One good use of continuation passing style is to manage dynamic resources. The resource allocation function is written in continuation passing style and it takes a callback that it invokes once it has allocated and initialized the resource. When the callback exits, the resource is uninitialized and deallocated.

(defun call-with-resource (receiver)
  (let ((resource nil))
    (unwind-protect
        (progn
          (setq resource (allocate-resource))
          (funcall receiver resource))
      (when resource
        (deallocate-resource resource)))))

  ;; example usage:
  (call-with-resource
    (lambda (res)
      (do-something-with res)))

;;; In Lisp, we would provide a convenient WITH- macro
(defmacro with-resource ((resource) &body body)
  ‘(CALL-WITH-RESOURCE (LAMBDA (,resource) ,@body)))

  ;; example usage:
  (with-resource (res)
    (do-something-with res))

This pattern of usage separates and abstracts the resource usage from the resource management. Notice how the unwind-protect is hidden inside call-with-resource so that the user of the resource doesn’t have to remember to deallocate the resource.

The with-resource macro is idiomatic to Lisp. You obviously can’t provide such a macro in a language without macros, but you can still provide the call-with-resource function.

Continuation passing style for resource management can be used in other languages, but it often requires some hairier syntax. Because call-with-resource takes a callback argument, it is actually a higher-order function. The syntax for passing higher-order functions in many languages is quite cumbersome. The return value of the callback becomes the return value of call-with-resource, so the return type of the callback must be compatible with the return type of the function. (Hence the type of call-with-resource is actually parameterized on the return value of the callback.) Languages without sophisticated type inference may balk at this.

Another advantage of the functional call-with-resource pattern is that you can dynamically select the resource allocator. Here is an example. I want to resolve git hashes against a git repository. The git repository is large, so I don’t want to clone it unless I have to. So I write two resource allocators: CallCloningGitRepository, which takes the URL of the repository to clone, and CallOpeningGitRepository which takes the pathname of an already cloned repository. The "cloning" allocator will clone the repository to a temporary directory and delete the repository when it is done. The "opening" allocator will open the repository and close it when it is done. The callback that will be invoked won’t care which allocator was used.

Here is what this looks like in golang:

// Invoke receiver with a temporary directory, removing the directory when receiver returns.
func CallWithTemporaryDirectory(dir string, pattern string, receiver func(dir string) any) any {
	dir, err := os.MkdirTemp(dir, pattern)
	CheckErr(err)
	defer os.RemoveAll(dir)

	return receiver(dir)
}

// Invoke receiver with open git repository.
func CallOpeningGitRepository(repodir string, receiver func(string, *git.Repository) any) any {
	repo, err := git.PlainOpen(repodir)
	if err != nil {
		log.Fatal(err)
	}
	return receiver(repodir, repo)
}

// Invoke receiver with a cloned git repository, removing the repository when receiver returns.
func CallCloningGitRepository(dir string, pattern string, url string, receiver func(tmpdir string, repo *git.Repository) any) any {
	if url == "" {
		return nil
	}
	return CallWithTemporaryDirectory(
		dir,
		pattern,
		func(tempdir string) any {
			log.Print("Cloning " + url + " into " + tempdir)
			repo, err := git.PlainClone(tempdir, true, &git.CloneOptions{
				Auth: &gitHttp.BasicAuth{
					Username: username,
					Password: password,
				},
				URL:      url,
				Progress: os.Stdout,
			})
			CheckErr(err)
			log.Print("Cloned.")
			return receiver(tempdir, repo)
		})
}

You specify a repository either with a URL or a pathname. We select the appropriate resource allocator based on whether the specifier begins with "https".

func RepositoryGetter (specifier string) func (receiver func(_ string, repo *git.Repository) any) any {
	if strings.EqualFold(specifier[0:5], "https") {
                return GetRemoteGitRepository (specifier)
	} else {
                return GetLocalGitRepository (specifier)
	}
}

func GetRemoteGitRepository(url string) func(receiver func(_ string, repo *git.Repository) any) any {
	return func(receiver func(_ string, repo *git.Repository) any) any {
		return CallCloningGitRepository("", "git", url, receiver)
	}
}

func GetLocalGitRepository(repodir string) func(receiver func(_ string, repo *git.Repository) any) any {
	return func(receiver func(_ string, repo *git.Repository) any) any {
		return CallOpeningGitRepository(repodir, receiver)
	}
}

To open a repository, we call RepositoryGetter(specifier) to get a getRepository function. Then we invoke the computed getRepository function on a receiver callback that accepts the local pathname and the repository:

  getRepository := RepositoryGetter(specifier)

  return getRepository(
         func (_ string, repo *git.Repository) any {
                 // resolve git hashes against the repository
                 ....
                 return nil
         })

If given a URL, this code will clone the repo into a temporary directory and open the cloned repo. If given a pathname, it will just open the repo at the pathname. It runs the callback and does the necessary cleanup when the callback returns.

The biggest point of confusion in this code (at least to me) are the type specifiers of the functions that manipulate the resource allocators. Static types don’t seem to mix well with continuation passing style.

Wednesday, July 10, 2024

Monkeys vs. Shakespeare

Google's golang was designed for mediocre programmers that aren't “capable of understanding a brilliant language”. It omits features that are hard to understand or hard to use, like conditional expressions, macros, exceptions, and object systems.

Lisp, on the other hand, was originally designed for smart programmers in order to research artificial intelligence. It contains language constructs like conditional expressions, a powerful macro system, a highly customizable error handling system, and a sophisticated object system.

A million monkeys typing on a million keyboards will eventually produce the works of Shakespeare. Lisp is a language for Shakespeares, while golang is a language for monkeys.

What is the fundamental problem we are solving? If the problem is simply an insufficient amount of software, then we need to write as much software as possible as rapidly as possible. Hire more monkeys. If the problem is gainfully employing all the monkeys quickly, give them crude tools that are easy to use. But if the problem is lack of quality software, then we need the tools to write the best software possible. Hire more Shakespeares and give them powerful tools.

Hiring monkeys is easier and cheaper than hiring Shakespeares. So why not just keep hiring monkeys until you have a Shakespeare equivalent? Experience shows how well this works. Just think of all the projects that were brought in on time and under budget when they doubled the number of monkeys. Just think of any project that was brought in on time and under budget when they doubled the number of monkeys. Surely there must be one?

Thursday, July 4, 2024

A YouTuber's First Look at Lisp

There is a guy who is making a series of videos where he takes a “first look” at various programming languages. I watched his “first look” at Lisp. Now he wasn’t going into it completely cold — he had looked at Scheme previously and had seen some Lisp — but it was still an early experience for him.

He gave it a good go. He didn’t have a pathological hatred for parenthesis and seemed willing to give it a fair shake. He downloaded SBCL and started playing with it.

Since he only had allocated an hour to it, he didn't cover much. But he encountered a few pitfalls that Lisp newbies often fall into. I found myself wanting to talk back at the screen and point out where he was going wrong when simple typos were causing errors.

For example, he was trying to format some output, but he forgot the closing double quote on the format string. The reader kept waiting for him to finish the string and simply gobbled up the format args and closing parens as part of the string. He was confused as to why nothing was happening. When he finally typed Control-C, he was faced with a spew of stack trace and a debugger prompt that was of no help to him.

A second problem he encountered was when he typed (print (first (*mylist*))). Any experienced Lisp hacker will obviously see the problem right away: *mylist* is not a function. But the error message was buried in a 30 level deep stack trace that wound back through the reader and REPL and completely obscured the problem.

The Lisp debugger is amazing, but it can be a bit overwhelming to a newbie. When I was TAing Lisp classes, I would often find that students were intimidated by the debugger. They felt that while they were in the debugger there was some sort of impending doom hanging over them and that they had to exit and get back to the normal REPL as quickly as possible. I think if I were teaching a Lisp course now, I would start off by deliberately causing errors and then showing students how to use the debugger to find the problem and recover from it.

PLT Scheme has an interesting “beginner” mode that disables some of the more advanced features of Lisp. Many typos in Lisp cause errors not because they are invalid, but because they mean something advanced, but unintended. For example, the beginner mode disables first-class functions, so accidentally putting in extra parenthesis becomes a syntax error instead of an attempt to funcall something inappropriate. Perhaps a “training wheels” approach would work with Lisp beginners.

But SBCL generates pretty good error messages. It was a case of TLDR. The reviewer just didn't take the time to read and understand what the error message said. He was too busy trying to figure out how to get back to the REPL.

All in all, the reviewer gave it a good go and came away with a positive impression of Lisp. He also took he time to research the language and provided some examples of where Lisp is used today. The video is at https://www.youtube.com/watch?v=nVhzHu2zSEQ if you are interested.

Wednesday, June 26, 2024

Goto not that harmful

I use goto all the time. When I have a loop, I goto the top of the loop after each iteration. When a function delegates to another function, I just goto the other function.

Gotos that just transfer control have the problem that the context is implicit. It isn’t obvious from the code what parts of the context are expected to persist across the control transfer. But if you explicitly pass arguments along with your control transfer, then you can see exactly what is carried across the control transfer.

A tail call isn’t “optimized” to a goto, it is a goto. It is a goto that passes arguments. Gotos that pass arguments aren’t harmful.

Monday, June 24, 2024

Embrace the Suck

The key point here is our programmers are Googlers, they’re not researchers. They’re typically fairly young, fresh out of school, probably learned Java, maybe learned C or C++, probably learned Python. They’re not capable of understanding a brilliant language but we want to use them to build good software. So, the language that we give them has to be easy for them to understand and easy to adopt.
— Rob Pike

How sad. Google used to hire top-notch programmers. Now it appears that they have hired a bunch of mediocre programmers who can't even learn how to properly use exceptions or ternary conditionals. Programmers that need “training wheels” on their C.

And how insulting to Googlers to be told that they are not capable of understanding a brilliant language. To be seen as merely a fungible resource to be “used” to build software. I'm sure some of them are capable of understanding a brilliant language. I'm sure that some are capable of learning how to use Haskell or OCaml or Clojure (or F# or Rust or Erlang or Scala) whatever language is best for the task.

Does success come to those that strive for excellence or those that embrace mediocrity?

Sunday, June 16, 2024

Decode a Float (Solution)

We can multiply or divide a floating point number by 2 without changing the bits in the mantissa. So we can rescale the number to be in the range [1, 2) by repeatedly multiplying or dividing by 2. The leftmost bit of the mantissa is always 1, but next bit determines whether the number is in the top half of the range or the bottom half. So if the number is equal to or greater than 1.5, the next bit is 1, otherwise it is 0.

If the number is in the range [1, 2), we can subtract 1 from it. The remaining bits will be shifted left until the leftmost bit is 1. If the number is greater than or equal to 1.5, then subtracting 1 will shift the bits left by one. But if the number is in [1, 1.5), we won’t know how many zero bits will be shifted out when we subtract 1. What we’ll do is add .5 to the number to turn on the next bit, then subtract 1 to shift the bits left by one.

Here is the code in Common Lisp:

(defun float->bits (float)
  (cons 1 (read-bits float 0)))

(defun read-bits (float count)
  (cond ((>= count 52) nil)
        ((> float 2.0d0) (read-bits (/ float 2.0d0) count))
        ((< float 1.0d0) (read-bits (* float 2.0d0) count))
        ((>= float 1.5d0) (cons 1 (read-bits (- float 1.0d0) (1+ count))))
        (t (cons 0 (read-bits (- (+ float .5d0) 1.0d0) (1+ count))))))

Note that all these operations are exact and do not cause round off.

Saturday, June 15, 2024

Decode a Float

The leftmost bit of any positive binary number is always 1. So if you were to left-justify a positive binary number, the top bit would always be 1. If the top bit is always 1, there is no need to implement it. Floating point numbers use this trick.

You can determine the bits of an integer using only arithmetic operations by repeatedly dividing by two and collecting the remainders. Today’s puzzle is to determine the bits of a floating point number using only arithmetic operations (no decode-float or integer-decode-float).

Thursday, June 6, 2024

D-day, 80 years ago today

More than 150,000 troops, 5,000 ships, 13,000 aircraft in the largest amphibious assault in history.

Wednesday, June 5, 2024

Multithreading and Immutable Data

I was amusing myself by looking at Lisp tutorials. They used the idea of a Tic-Tac-Toe service as a motivating example. You’d be able to play Tic-Tac-Toe against the computer or another opponent.

My immediate thought went to the issue of multithreading. If you were going to serve hundreds of people at once, you’d need to have a multi-threaded service. Multi-threaded code is hard to write and debug, and it is much better if you have a plan before you start than if you try to retrofit it later (that trick never works).

The magic bullet for multi-threading is immutable data. Immutable data is inherently thread-safe. It doesn’t need synchronization or locks. If all your data are immutable, you can pretty much ignore multi-threading issues and your code will just work.

Using a 2D array to represent a Tic-Tac-Toe board is the obvious thing that first comes to mind, but not only are arrays mutable, they virtually require mutation to be of any use. The Lisp tutorials I was looking at all used arrays to represent the board, none of them locked the board or used atomic operations to update it, and all had the potential for race conditions if two threads tried to update the board at the same time. Arrays are essentially inherently thread-unsafe.

I thought about alternative representations for the board. Different representations are more or less amenable for writing code that avoids mutation. I came up with a few ideas:

  • Use a 2d array, but copy it before each mutation. This is horribly inefficient, but it is simple.
  • Use a 1d array, again copying it before each mutation. This isn’t much different from the 2d array, but iterating over the cells in the board is simpler.
  • Keep a list of moves. Each move is a pair of player and position. To determine the state of the board, you iterate over the list of moves and apply them in order. This is a bit more complicated than the array representations, but it is inherently immutable. It also has the advantage that you can rewind the board to any prior position.
  • Encode the board as a pair of bitmaps, one for each player.
  • Encode the board as a single bitmap, with each cell represented by two bits.
  • There are only 39 ways to fill out a Tic-Tac-Toe grid, so you could represent the board as an integer.

Each one of these representations has pros and cons. I wrote up some sample code for each representation and I found that the representation had a large influence on the character of the code that used that representation. In other words, there wasn’t a single general Tic-Tac-Toe program that ended up being specialized to each representation, but rather there were six different Tic-Tac-Toe programs each derived from its own idiosyncratic representation.

In conclusion, it is a good idea to plan on using immutable data when you might be working with a multi-threaded system, and it is worth brainstorming several different representations of your immutable data rather than choosing the first one that comes to mind.

Saturday, June 1, 2024

Roll Your Own Syntax

Unlike most languages, Lisp represents its programs as data structures. A Lisp program is a set of nested lists. We can look at a Lisp program as a tree, with each nested list as a node in the tree. The first element of each list indicates the kind of node it is. For instance, a sublist beginning with LET binds local variables, a sublist beginning with IF is a conditional, and so on.

In most languages, it difficult or impossible to add new node types to the syntax tree. The syntax is wired into the language parser and if you even can add new syntax, you have to carefully modify the parser to recognize it. In Lisp, adding new node types is quite easy: you just mention them.

To give an example, suppose you wanted to add a new node to the syntax tree called COMMENT, which would have a string and a subexpression as components. You'd use it like this:

(comment "Avoid fencepost error" (+ x 1))

Here's how you could define the semantics of a COMMENT node in Lisp:

(defmacro comment (string expr)
  expr)

That's it. You can now insert arbitrary COMMENT nodes into your Lisp programs.

Compare that to what you would have to do in a language like Java to add a new kind of node to the syntax tree. You'd have to modify the parser, the lexer, the AST classes, and probably a bunch of other stuff. It's non trivial.

For a more complex example, consider adding transactions to a language. In a language like Java, you'd have to modify the parser to recognize the new syntax, and then you'd have to modify the compiler to generate the correct code. In Lisp, you can just define a new node type:

(defmacro with-transaction (body)
  <complex macro elided>)

And then use it like this:

(with-transaction
  (do-something)
  (do-something-else))

Now obviously you should put some thought into doing this. Adding dozens of random new node types to the language would be a bad idea: readers of the code wouldn't be expecting them. But in some cases, a new node type can be just what is called for to abstract out complexity or boilerplate. Lisp gives you that option.

Tuesday, May 28, 2024

If I Were in Charge

If I were in charge of Python development, here are a few things I would do:

  • Add (optional) tail recursion. This would make it easier to write pure functional code. It would also make it possible to effectively program in continuation passing style. Making tail recursion optional should placate those that feel that stack traces are important for debugging.
  • Add macros. I am thinking of Lisp-like macros that do code transformation, not C-like macros that simply do token substitution. A good macro system would allow advanced users to create new syntactic forms for the language and provide a way to abstract boilerplate.
  • Allow a way to use statements inside expressions, or beef up the expression syntax to have exception expressions, loop expressions, etc. This, too, would make it easier to write pure functional code.
  • Optional end-of-block markers. These would allow you to automatically fix indentation errors and recover indentation when it is lost.
  • Use true lexical scoping. Changing this might break legacy code that depends on the current scoping quirks, though.
  • Use modern interpretation techniques to get the performance up to a more reasonable level. Performance doesn't matter that much, but Python is notably slow.
  • Get rid of the global interpreter lock so that multithreading works better. Probably easier said than done.

I don't believe any of these necessarily involve fundamental changes to the language. They'd just make the language more flexible, though I'm sure many people would disagree with me.

But, perhaps for the best, they're not going to put me in charge.

Sunday, May 26, 2024

Exception Handling for Control Flow

Back when I was taking a Software Engineering course we used a language called CLU. CLU was an early object-oriented language. A feature of CLU was that if you wrote your code correctly, the compiler could enforce completely opaque abstract data types. A good chunk of your grade depended on whether you were able to follow insructions and write your data types so that they were opaque.

CLU did not have polymorphism, but it did have discriminated type unions. You could fake simple polymorphism by using a discriminated type union as the representation of an opaque type. Methods on the opaque type would have to dispatch on the union subtype. Now to implement this correctly, you should write your methods to check the union subtype even if you “know” what it is. The course instructors looked specifically for this and would deduct from your grade if you didn't check.

One subproject in the course was a simple symbolic math system. You could put expressions in at the REPL, substitute values, and differentiate them. The core of the implementation was term data type that represented a node in an expression tree. A term was a type union of numeric constant, a symbolic variable, a unary expression, or a binary expression. Unary and binary expressions recursively contained subterms.

The term data type would therefore need four predicates to determine what kind of term it was. It would also need methods to extract the constant value if it was a constant, extract the variable name if it was symbolic, and extract the operator and subterms if it was a unary or binary expression. The way to use the data type was obvious: you'd have a four-way branch on the kind of term where each branch would immediately call the appropriate selectors. But if you wrote your selectors as you were supposed to, the first thing they should do is check that they are called on the appropriate union subtype. This is a redundant check because you just checked this in the four-way branch. So your code was constantly double checking the data. Furthermore, it has to handle the case should the check fail, even though the check obviously can never fail.

I realized that what I needed was a discrimination function that had four continuations, one for each subtype. The discrimination function would check the subtype and then call the appropriate continuation with the subcomponents of the term. This would eliminate the double checking. The code was still type safe because the discrimination function would only invoke the continuation for the correct subtype.

One way to introduce alternative continuations into the control flow is through exceptions. The discrimination function could raise one of four exceptions, each with the appropriate subcomponents as arguments. The caller would not expect a return value, but would have four catch handlers, one for each subtype. The caller would use the try/except syntax, but it would act like a switch statement.

The TA balked at this use of exceptions, but I appealed to the professor who saw what I was trying to do and approved.

There is some debate about whether exceptions should be used for control flow. Many people think that exceptions should only be used for “exceptional” situations and that it is poor form to use them for normal control flow. I think they are taking too narrow a view. Exceptions are a way to introduce alternative paths of control flow. One can use them to, for instance, handle an exceptional situation by jumping to a handler, but that's not the only way to use them.

Of course you should think hard about whether exceptions are the right way to introduce alternative control flow in your use case. Exception syntax is usually kind of klunky and you may need to rewrite the code to insert the exception handling at the right point. Readers of your code are likely to be surprised or baffled by the use of exceptions for control flow. There is often a significant performance penalty if an exception is thrown. But sometimes it is just the trick.

Saturday, May 25, 2024

ECS in CLOS

Object systems make a natural fit for the game programming domain, but over time people have found that the object-oriented model provided by their favorite language doesn't always fit the use case. So developers have come up with “entity-component systems” (ECS) of varying complexity to fill the gap.

The basic idea is that a game object, called an “entity”, contains or refers to a set of “components” that define its behavior. Entities are too varied to be captured by a single class hierarchy, so we abandon inheritance in favor of composition. An entity that can be displayed on the screen has a “sprite” component, an entity that can move has “position” and “velocity” components, an entity that you can attack has a “hitbox” component and a “health” component. A entity that can attack you has an “attackbox” component. You can play mix and match the components to customize the behavior of the entity. We don't have a hierachy of components because each component can be added more or less independently of the others.

An ECS is an alternative or an augmentation to the built-in object system of a language, but it is an admission that the built-in object is insufficient.

CLOS provides an elegent way to implement an ECS without abandoning CLOS's built-in object system. We define a component as a “mixin” class that can be inherited from using multiple inheritance. We define mixin classes for each component, and then we define entity classes that inherit from the mixin classes. We would define a “sprite” mixin class, a “position” mixin class, etc. So the class of enemy entities would inherit from “sprite”, “position”, “velocity”, “hitbox”, “health”, and “attackbox” classes. A trap entity would inherit from “sprite”, “position”, and “attackbox” classes. A container entity that you could smash open would inherit from “sprite”, “position”, and “hitbox” classes. Etc.

Mixin classes aren't intended to be instantiated on their own, but instead provide slots to the classes that inherits from them. Furthermore, methods can be specialized on mixin classes so that instances derived from the mixin will respond to the method. This allows you to inherit from a mixin providing the particular desired behavior. For example, the “health” mixin class would provide a slot containing the entity's health and a “damage” method to decrease the health. Any entity inheriting from the “health” mixin will react to the damage method. Mixin classes can provide functionality similar to interfaces.

Mixin classes are an exception to Liskov's Substitution Principle, but they are a useful exception. Entities that inherit from a mixin do not have a “is-a” relationship with the mixin, but rather a “has-behavior-of” relationship. An entity inheriting from the “health” mixin is not a “health”, but it has the “health” behavior.

One feature of an ECS is that you can dynamically change the components of an object. For example, once an enemy is defeated, you can remove the “health” and “attackbox” components and add a “corpse” component. Is CLOS you could accomplish this by changing the class of the entity object to a class that doesn't have those mixins.

Of course care must be taken or you will end up with CLOS “soup”. But if you are careful, CLOS can provide a powerful and flexible system for implementing an ECS.

Friday, May 24, 2024

Why I Want Tail Recursion

The reason I want tail recursion is not to write loops (I can do that with a while loop), but to write in continuation passing style if I need to. Continuation passing style allows you to implement any control flow pattern you can imagine, not just the ones intrinsic to the language. You don’t want to use it all the time, but it’s a valuable fallback when you need some ad hoc advanced control flow. Without tail recursion, any non-trivial use of continuation passing style risks blowing the stack.

Iteration is just the special case of linear recursion that doesn’t accumulate state. 99 percent of the time, you know beforehand that you are looping and can use a looping construct, but sometimes you have the general case where whether you loop or not depends on the data at runtime. If you have tail recursion, you just write the code recursively and the tail recursion mechanism will turn it into a loop if it notices that you aren’t accumulating state.

If your language has lambda and tail recursion, it can implement any other control flow that might have been overlooked by the language designer. If it doesn’t, you're limited to the control flow the language designer bothered to implement.

Thursday, May 23, 2024

Rewrite rules

I like rewrite rules. They are simple to understand and easy to implement. If you have a rewrite rule (or a set of them) that makes incremental progress in making an expression "simpler" in some sense, you can keep applying it (or them) repeatedly until you have finished the calculation.

If a function directly returns the result of calling another function, we say that the call is in "tail position". We can view this as a simple rewrite rule. Consider the fact-mul function which computes the factorial of a number n and multiplies it by another number m:

fact-mul (n, m) = n! * m

We can define two rewrite rules for this function. If n is 0, then the factorial is 1, so we just rewrite this as m. If n is not 0, we can rewrite this as fact-mul (n-1, n*m). This translates directly into Lisp:

(defun fact-mul (n m)
  (if (zerop n)
      m
     (fact-mul (- n 1) (* n m))))

The rewrite rule doesn't have to rewrite into the same function. For instance, we can rewrite odd? (x) to even? (x-1) and even? (x) to odd? (x-1):

(defun odd? (x)
  (if (zerop x)
      nil
      (even? (1- x))))

(defun even? (x)
  (if (zerop x)
      t  
      (odd? (1- x))))

Yes, this is bog-simple tail recursion, but somehow it seems even simpler to me if I think of it as a rewrite rule.

Even if you don't have tail recursion, or if you have disabled it, it still works, but the number of rewrites is limited by the stack depth.

Tuesday, May 21, 2024

If only ...

I have a problem with Lisp. It doesn’t do what I want. Whenever I write a Lisp program, I always find myself wanting new special forms that aren’t in the language. The object system almost never does exactly what I want. The control structures never cover all my use cases.

If only I could add new syntactic forms to the language. If only I could tweak the internals of the object system. If only I could add new kinds of control structure.

Oh, wait. I can.

Monday, May 20, 2024

Maybe machine (part 2)

At the end of 6.004, we had a contest for improving the performance of the Maybe machine. There was an 8-bit ALU included in the chip set we were given, so the logical thing to do was to incorporate it into the machine in place of the shift register. This would naturally improve the performance of arithmetic operations, but there were other reasons that the Maybe machine was slow.

One of the rules of the contest was that we were not allowed to change the clock speed. But the Maybe machine had a four phase clock. The first phase would address the device driving the bus, the second phase would connect it to the bus, the third phase would clock the receiving device, and the fourth phase would keep the data asserted to obey the hold time on the logic.

The four phase clock was technically unnecessary. We had built it in this manner to illustrate set-up and hold times for the bus, but the TTL chips we were using were specifically designed to be used on an open collector bus with no hold time, so you could actually clock the bus on every other phase rather than divide by four and still be within valid hardware specs. The fundamental clock speed was not changed, but the machine would run twice as fast.

The microcode word on the Maybe machine had four parts. The first part loaded the shift register from memory. The second part operated on the shift register. The third part wrote the shift register back to memory, and the fourth part was the next microcode address. But much of a typical microinstruction was wasted time. More often than not, the shift register was loaded from the last location it was written to, so it already contained the data it needed. The load cycle was often unnecessary. If you just left the data in the shift register, it would already be there for the next instruction. A memory to memory move involved a no-op on the shift register, which took clock cycles. Advancing to the next instruction took more clock cycles.

Instead of a four part microinstruction, I made a one-part microinstruction. True, this meant that I would occasionally need more microinstructions, but each one was smaller, and I could omit the no-ops, so I could accomplish the same amount in fewer clock cycles. I eliminated the next-instruction field by replacing the microcode address register with a counter.

The original Maybe machine had a two-address microcode — each instruction would specify a separate load and store address. Most people retained the wide microcode and could specify each input to the ALU independently. But I wanted to squeeze each microinstruction into 16 bits, so I turned the machine into a one-address machine with an accumulator. This made my microcode one quarter of the size, but it made it difficult to find enough bits in the microcode to accomplish all the tasks it needed to do. I didn’t have enough bits to directly address the accumulator, so I attached the accumulator to the output of the ALU. One ALU input came from the bus, the other was fed back from the accumulator. To load the accumulator, you had to pass the data through the ALU.

There weren’t enough bits for jump instructions, but I did a hack: I incremented the instruction address half way through the instruction cycle. This way, you could read the bits of the next instruction while you were still executing the current instruction. So you could do an unconditional jump by placing the target in the next instruction. There still weren’t enough bits for a conditional jump, so the microcode address space ended up being partitioned into sixteen segments and you could only conditionally jump within the same segment. To jump between segments you had to use an unconditional jump.

I obviously had to do a complete rewrite of the microcode. My friend Bob Baldwin helped me out by writing an assembler for the new microinstruction format.

The new microcode ran substantially faster than the original. When the contest came, my machine was clearly faster than the competition. Unfortunately, it crashed on the third benchmark and I ended up disqualified. I think it was a bug involving switching between the segments of microcode, but I never had the opportunity to debug it fully. I learned a hard lesson about keeping things simple.

Maybe machine

When I took 6.004, Computation Structures, we built the “Maybe” machine, a pedagogical microcomputer made from discrete TTL. It had no CPU. It had no ALU. It had a shift register. The microcode could load the shift register from memory, test the low bit, shift the register, and store the register to memory. That was it. It was universal (Turing complete), but it was primitive. If I recall correctly, Steve Ward, Chris Terman, and Dan Nussbaum designed it.

We each built our own Maybe machine on a “nerd kit” — a large briefcase/small suitcase with a power supply, some solderless breadboards, some switches and LEDs, and a bunch of TTL chips. We had to wire up the chips ourselves. I was maybe a bit anal about wiring it up. I color coded the wires, made neat right angle bends, and ran the wires in parallel in the troughs between the breadboards. But when it came to testing the gates, this made it easier to find the wires that had ended up in the wrong place.

When we were building it, we had a test PROM with microcode that would just exercise the logic gates. It had no functional program in it. If the gates were wired correctly, certain patterns would appear on the LEDs.

At last I had it all wired up. Each gate responded appropriately to the test PROM. I double checked to make sure I ran each and every test. I was ready to try the real microcode now. I put my EEPROM under the UV light to erase it, and then programmed it with the microcode. I verified that the PROM contained the correct bits. I popped the PROM into the socket, and powered up the machine. Nothing. The machine did not display the expected pattern of LEDS. It didn’t display anything. Crap.

I erased the EEPROM and reprogrammed it as a test PROM again. I ran all the tests to find which one was failing. All passed. I ran them again. Everything checked out. I started to think maybe a loose wire? I ran the tests again and wiggled the wires. The tests still passed.

Confused, I erased the EEPROM and reprogrammed it with the microcode. I put it back in the Maybe machine and powered it up. Nothing. I was stumped. I had no idea what was wrong. I tried wiggling the wires. I tried wiggling the chips. Nothing.

I reprogrammed the test PROM. Each reprogramming of the test PROM took at least 15 minutes as I waited for the UV light to erase the old program and then wrote in the new one. I verified one last time that all the tests ran correctly and went to consult the TA.

The TA had heard it all before. Some freshman can’t follow instructions, wires it up wrong, and is sloppy about running the tests. He was going to demonstrate the right way to thoroughly test the machine. He had a checklist of each and every test. He was going to exhaustively go through the checklist and tally each one as he went. He was going to show me how to do it right.

He was a little surprised that he didn’t find the problem right away. As each test passed, be became more and more puzzled. Finally, he got to the end of his checklist and every test had passed. So he programmed the microcode into the EEPROM and powered up the machine. Nothing.

Ok, it was a challege. The machine made him look foolish in front of this lower classman. He was going to find the problem. He reprogrammed the test PROM and ran the tests again. As he ran each test, he’d wiggle the wiring and the chips involved in the functional unit being tested. Everything was in order. But when he put in the microcode ROM, everything was dead.

We went through a couple of cycles of this, but it was getting to be dinner time so we decided to call it a night. I could only think that I was going to have to rewire the whole thing from scratch. I was dreading it. I headed back to my dorm room.

But...

It was odd that when you tried to run the microcode, although you didn’t get the expected pattern of LEDs, they did briefly flicker when you powered up the machine. The instruction after loading the LEDs was a conditional branch to itself. The LEDs would be loaded and then stay on until the conditional became true when the user pressed a button. It was almost as if the conditional was taken immediately rather than waiting for the button press. Maybe the conditional was backwards? If the conditional were backwards, the LEDs would flicker because they’d be loaded and the mahine would immediately branch.

When I got to my dorm room, I opened the nerd kit and checked. Sure enough, the conditional was wired to the inverted output of a flip-flop. Every conditional branch would be taken the wrong way. The non-inverted output was the next hole over. The test PROM checked that the wire was connected, but didn’t check that it was connected to the positive sense of the conditional bit.

The next day, I went back to the lab with the fix. I reprogrammed the EEPROM with the microcode and powered up the machine. The LEDs lit up in the expected pattern and the machine ran as expected.

Thursday, May 16, 2024

Readability at Google

Over the years I was a developer at Google I never once tried to pass “readability” for any language. If you didn’t have “readability”, then any code you checked in would need an additionally review by someone who did have “readability”. The theory was that the code base would turn into an unreadable patchwork of different styles if everyone didn't adhere to the same set of rules. By having a “readability” requirement, the code base would be uniform and understandable. In practice, however, it was a social signal that indicated whether you were a card carrying member of the “in” group or not. Having “readability” meant you were a nobleman. Not having it meant you were a commoner.

To get “readability” you had to endure a struggle session with a current holder of “readability”. You would submit a not insubstantial amount of code for review and the reviewer would go over it with a fine-tooth comb pointing out every little syntactic rule you broke. It was a game of gotcha designed to make the reviewer feel smarter than you. You’d then return to your office or cubicle, make changes and iterate. If you were lucky, you would only need two or three iterations before you were bestowed with “readability”. The point was not to demonstrate that you knew how to write readable code, but to demonstrate that you were willing to submit to the authority of the reviewer and follow a set of arbitrary rules, no matter how absurd. It was a hazing ritual, a way to show that you were willing to be a team player.

A problem with having “readability” was that you were expected to uphold the “readability” guidelines. Not only did you have to play the game, you had to perpetuate it. I disagreed with a lot of the guidelines because they were poorly thought out rules for the sake of having rules. There were many situations in which they would reduce the comprehensibility of the code. By not having “readability”, I was free to write code that was easier to understand and maintain, even if bent or broke the codified rules.

This isn’t to say that I ignored Google’s style. It is hard to understand code that switches styles every few lines. I tried to make my code fit in to the surrounding code and look like it was part of it. I wrote new files with same general layout, curly brace conventions, and indentation that a typical Google file has. I didn’t worry about following “readability” guidelines at all, but I didn’t flout the guidelines either. I more concerned about my code being comprehended by the next developer than following a set of ad hoc rules to the letter.

So I got used to submitting my code for “readability” review, in addition to the normal code review. I off-loaded the entire responsibility of “readability” to the reviewer and let them tell me what to do. I didn’t argue with them. I simply followed the directions of the reviewer and made whatever changes were suggested. Often, though, changes to the code to more strictly follow the guidelines would result in noticeably worse code and the reviewers would soon withdraw their suggestions.

Another problem of being a nobleman with “readability” is that you had the obligation to review the code written by the commoners. The people with “readability” was a small enough group that they soon became the bottleneck in the review process. They were constantly overworked with code reviews. We commoners, on the other hand, weren’t expected to review code. Nor were we to blame if someone else is holding up the code review.

Frankly, I didn’t see any upside of having “readability” and many downsides. I was able to get my code checked in without too much trouble and I just didn’t have the time or inclination to get involved in the high-school politics of “readability”. Plenty of other companies do not have a “readability” ritual. They seem to get along just fine by hiring people who can write code that is easy to understand and maintain.

Friday, May 10, 2024

The Way of Lisp or The Right Thing

Interpreting Richard Gabriel with a nod to Tim Peters

Keep it simple.
A simple interface is better than a simple implementation.
Complete is better than simple.
Consistent is better than complete.
Correctness trumps all. Incorrectness is simply not allowed.
If the interface is hard to use, it is probably a bad idea.
If the interface is easy to use, it may be a good idea.
A weak language makes it hard to write good code, but a powerful language makes it easy to write bad code that looks good.
The first thing that comes to mind is not always the best thing.
When performance matters, get your declarations right.
There is always another way to do it, and maybe a better one.
Lists aren’t the only data type.
The right thing and 2 shillings will get you a cup of tea.

Monday, May 6, 2024

Motion without Integration

A game where things move around has equations of motion that describe how objects move. We need to solve these to produce the locations of the objects as a function of time, and we have to do it in lock step real time. Fortunately, we usually don’t need highly realistic models or highly accurate results. Game phyiscs, while inspired by real world physics, can be crude approximations.

If the velocity of an object changes over time, we need to integrate the velocity to determine the position. This is typically done with forward Euler integration: the current velocity is multiplied by a small Δt and this is added to the current position. In other words, we assume the object moves linearly over the amount of time represented by Δt. If Δt is small enough, the linear segments of motion will approximate the actual curve of motion.

In a game, we use the game loop, which runs every few milliseconds, as our source of time. We subtract the current time from the previous time the game loop was run to obtain our Δt. We update each object’s state using forward Euler integration over Δt.

Our game loop keeps track of the last time it was run and subtracts that from the current time. It runs entity-step! on each entity in the game, passing in dticks.

(let ((start-tick (sdl2:get-ticks)))
  (let ((last-tick start-tick))
    (loop
      (let* ((tick (- (sdl2:get-ticks) start-tick))
             (dticks (- tick last-tick)))
        (map nil (lambda (entity)
                   (entity-step! entity dticks))
                 entities)
        (setq last-tick tick)))))

entity-step! performs our forward Euler integration:

(defun entity-step! (entity dticks)
  (incf (position entity) (* (velocity entity) dticks)))

Assuming the game loop runs reasonably frequently, the position of the entity will be a reasonable approximation of the integral of the velocity. If the game loop gets delayed for some reason, e.g. garbage collection, the corresponding increase in the value of dticks will make up for it.

But sometimes we have the fortune of having a closed form solution to the time integral. If this is the case, we can compute the entity’s state directly and we don’t need to integrate the state changes on every iteration of the game loop. Let me give two examples.

In the platformer game, there are potions that the player can take to restore his health. A potion appears as a vial that floats above the ground. A “bobbing” effect is achieved by changing the height above the ground in a time varying manner. Now we could implement this by having the game loop modify the y position of the potion on each iteration, but there is a closed form solution. We can compute the y position at any time by adding a constant base y position to a factor of the sine of the current time. This is easily done by specializing the get-y method on potions:

;;; Make potions bob up and down.
(defmethod get-y ((object potion))
  (+ (call-next-method)
     (* 5 (sin (* 2 pi (/ (sdl2:get-ticks) 1000))))))

call-next-method invokes the next most specific method — in this case, the get-y method of the entity class, which returns the base y position. We add a sine wave based on the current time.

This is only thing we need to modify to make a potion bob up and down. Values dependent on the y position, such as the attackbox, will bob up and down with the potion because it uses the get-y method to determine what the potion is.

A second example is cannonball motion. Cannonballs travel linearly away from the cannon at a constant velocity. Rather than stepping the cannonball by adding its velocity to its position on each update, we specialize the get-x method on cannonballs

(defmethod get-x ((cannonball cannonball))
  (+ (call-next-method)
     (* (get-speed cannonball)
        (- (sdl2:get-ticks) (get-start-tick cannonball)))))

If we specialize both the get-x and get-y methods we can easily get objects to trace out any desired parametric curve, like a Lissajous figure.

Saturday, May 4, 2024

Animation: Two Methods

Animated sprites are much cooler that static ones and really make your game come to life. They are pretty easy to implement, but there are a couple of ways to implement them, and I've seen several projects use the less optimal strategy.

The less optimal strategy is straightforward. We keep a time counter in the animated entity and increment it every time we update the entity. When the counter exceeds the amount of time for an animation frame, we advance to the next frame and reset the ticker.

(defclass entity ()
  ((animation-ticker :initform 0 :accessor animation-ticker)))

(defun entity-step! (entity dticks)
  (incf (animation-ticker entity) dticks)
  (when (> (animation-ticker entity) ticks-per-frame)
    (advance-to-next-frame! entity)
    (decf (animation-ticker entity) ticks-per-frame)))

Now this works, but the reason it is less optimal is that it involves maintaining time varying state in the entity. Recall that the game has two primary threads running: a render thread which runs 60 times a second and draws the game on the screen, and an game loop thread which runs maybe 200 times a second and updates the state of the entities in the game. The point of this is to separate and abstract the drawing of the game from the running of the mechanics of the game. By using this less optimal method, we involve the game loop thread in maintaining state that is used only by the render thead. To anthromorphize, the entity “knows” that it is animated and is taking an active role in keeping things moving.

A better way is to put the responsibility of animation solely within the render thread by having it select which animation frame to display when it draws the sprite. The sprite contains a vector of animation frames and a timestamp at which the animation was started. When redrawing the sprite, the elapsed time is computed by subtracting the start timestamp from the current time, and the frame to display is computed by dividing the elapsed time by the time per frame modulo the length of the frame vector.

(defclass animation ()
  ((start-tick :initform (get-ticks) :reader start-tick)
   (frames :initarg frames :reader frames)))

(defun display-animation! (animation)
   (let* ((elapsed-time (- (get-ticks) (start-tick animation)))
          (absolute-frame (floor elapsed-time ticks-per-frame))
          (frame (mod absolute-frame (length (frames animation)))))
     (display-frame! (svref (frames animation) frame))))

This way of doing it no longer needs time varying state, and we’ve abstracted the display of the animation from the handling of the entity’s state. The entity no longer has to “know” that it is animated or do anything about it. There no longer has to be communication between the game loop thread and the render thread on every animation frame. This could be important if the render thread is running on a different machine than the game loop thread. If the game loop thread lags (maybe because of garbage collection or entering the error handler), we still get smooth animation.

Tuesday, April 30, 2024

Statements and Expressions

In some languages, like Lisp and OCaml, every language construct is an expression that returns a value. Other languages, like Java or Python, have two kinds of language constructs: expressions, which combine compositionally and which have return values, and statements, which combine sequentially and which have no return values and thus must operate by side effect. Having statements in your language needlessly makes things more complicated, but language designers seem to want to go much further and add complexity that just seems capricious.

You cannot usually use a statement in a context where an expression is expected because there is no return value. You can use an expression where a statement is expected by simply discarding the return value. This means there are two kinds of contexts. A sane designer would provide a way to switch between statement and expression contexts, but language designers typically omit these.

Language constructs, such as binding or iteration, must be provided as statements or expressions or both. Language designers seem to randomly decide which constructs are statements and which are expressions based on whims and ease of compiler implementation. If you need to use a construct, but aren’t in the right kind of context, you need to switch contexts, but without a way to do this, you may have to rewrite code. For example, if you have a subexpression that could raise an exception that you want to handle, you’ll have to rewrite the containing expression as a series of statements.

A sane language designer would use the same syntax for the same construct in both expression and statement form, but typically the construct will have a very different syntax. Consider sequential statements in C. They are terminated by semicolons. But sequential expressions are separated by commas. Conditional statements use if/else, but conditional expressions use ?:. When refactoring, you cannot simply move code between contexts, you have to change the syntax as well.

Writing a function adds a new expression to the language. But function bodies aren’t expressions, they are statements. This automatically destroys referential transparency because you cannot substitute the function body (statements) at the call site (expression context).

try/catch is a nightmare. Usually you only get try/catch as a statement, not an expression, so you cannot use exception handling in a subexpression. You cannot get a value out of a try/catch without a side effect. Throw, on the other hand, throws a value, so it could throw to an expression context, except that catch is a statement.

Having two kinds of language constructs and two different contexts in which you can use some of them but not others, and different syntaxes depending on the context just makes it that much more difficult to write programs. You have to keep track of which context is current and which syntax to use and constantly switch back and forth as you write a program. It would be easy for a compiler to introduce the necessary temporaries and rewrite the control flow so that you could use all language constructs in either context, but language designers don’t bother and leave it up to the programmer to do it manually.

And all this mess can be avoided by simply making everything be an expression that returns a value.

Thursday, April 25, 2024

State Machines

One of the things you do when writing a game is to write little state machines for objects that have non-trivial behaviors. A game loop runs frequently (dozens to hundreds of times a second) and iterates over all the state machines and advances each of them by one state. The state machines will appear to run in parallel with each other. However, there is no guarantee of what order the state machines are advanced, so care must be taken if a machine reads or modifies another machine’s state.

CLOS provides a particularly elegant way to code up a state machine. The generic function step! takes a state machine and its current state as arguments. We represent the state as a keyword. An eql specialized method for each state is written.

(defclass my-state-machine ()
  ((state :initarg :initial-state :accessor state)))

(defgeneric step! (state-machine state))

(defmethod step! ((machine my-state-machine) (state (eql :idle)))  
  (when (key-pressed?)
    (setf (state machine) :keydown)))

(defmethod step! ((machine my-state-machine) (state (eql :keydown)))
  (unless (key-pressed?)
    (setf (state machine) :idle)))

The state variables of the state machine would be held in other slots in the CLOS instance.

One advantage we find here is that we can write an :after method on (setf state) that is eql specialized on the new state. For instance, in a game the :after method could start a new animation for an object.

(defmethod (setf state) :after ((new-state (eql :idle)) (machine my-state-machine))
  (begin-idle-animation! my-state-machine))

Now the code that does the state transition no longer has to worry about managing the animations as well. They’ll be taken care of when we assign the new state.

Because we’re using CLOS dispatch, the state can be a class instance instead of a keyword. This allows us to create parameterized states. For example, we could have a delay-until state that contained a timestamp. The step! method would compare the current time to the timestamp and go to the next state only if the time has expired.

(defclass delay-until ()
  ((timestamp :initarg :timestamp :reader timestamp)))

(defmethod step! ((machine my-state-machine) (state delay-until))
  (when (> (get-universal-time) (timestamp state))
    (setf (state machine) :active)))

Variations

Each step! method will typically have some sort of conditional followed by an assignment of the state slot. Rather that having our state methods work by side effect, we could make them purely functional by having them return the next state of the machine. The game loop would perform the assignment:

(defun game-loop (game)
  (loop
    (dolist (machine (all-state-machines game))
      (setf (state machine) (step machine (state machine))))))

(defmethod step ((machine my-state-machine) (state (eql :idle)))  
  (if (key-pressed?)
      :keydown
      :idle))

I suppose you could have state machines that inherit from other state machines and override some of the state transition methods from the superclass, but I would avoid writing such CLOS spaghetti. For any object you’ll usually want exactly one state transition method per state. With one state transition method per state, we could dispense with the keyword and use the state transition function itself to represent the state.

(defun game-loop (game)
  (loop
    (dolist (machine (all-state-machines game))
      (setf (state machine) (funcall (state machine) machine)))))

(defun my-machine/state-idle (machine)
  (if (key-pressed?)
      (progn
         (incf (kestroke-count machine))
         #'my-machine/state-keydown)
      #'my-machine/state-idle))

(defun my-machine/state-keydown (machine)
  (if (key-pressed?)
      #'my-machine/state-keydown
      #'my-machine/state-idle))

The disadvantage of this doing it this way is that states are no longer keywords. They don’t print nicely or compare easily. An advantage of doing it this way is that we no longer have to do a CLOS generic function dispatch on each state transition. We directly call the state transition function.

The game-loop function can be seen as a multiplexed trampoline. It sits in a loop and calls what was returned from last time around the loop. The state transition function, by returning the next state transition function, is instructing the trampoline to make the call. Essentially, each state transition function is tail calling the next state via this trampoline.

State machines without side effects

The state transition function can be a pure function, but we can remove the side effect in game-loop as well.

We keep parallel lists of machines and their states (represented as state transition functions).

(defun game-loop (machines states)
  (game-loop machines (map 'list #'funcall states machines)))

Now we have state machines and a driver loop that are pure functional.

Friday, April 19, 2024

Plaformer Game Tutorial

I was suprised by the interest in the code I wrote for learning the platformer game. It wasn’t the best Lisp code. I just uploaded what I had.

But enough people were interested that I decided to give it a once over. At https://github.com/jrm-code-project/PlatformerTutorial I have a rewrite where each chapter of the tutorial has been broken off into a separate git branch. The code is much cleaner and several kludges and idioticies were removed (and I hope none added).

Monday, April 1, 2024

You May Not Need That :around Method

I’ve seen this “anti-pattern” a few times in CLOS code. A superclass ’super will have a subclass ’sub and there will be a primary method specialized to the superclass.

(defmethod foo ((instance super) arg)
  (format t "~&Foo called on ~s." arg))

Then I’ll see an :around method defined on the subclass:

(defmethod foo :around ((instance sub) arg)
  (format t "~&Start foo...~%")
  (call-next-method)
  (format t "~&End foo.~%"))
The intent here is clearly that code in the method specialized on the subclass is invoked “around” the call to the method specialized on the superclass.

But the :around qualifier is not necessary and probably doesn’t do what is intended. If we remove the :around qualifier, then the most specific primary method will be the foo method specialized on ’sub. And the (call-next-method) invokation will chain up to the foo method specialized on ’super. It will work as was likely intended.

:around methods are useful when the superclass wants to run a method “around” the subclass. :around methods are combined from least specific to most specific — the opposite order of primary methods — so that the superclass can wrap the call to the subclass. An good example of where an :around method would be handy is when you need to sieze a lock around the call to the method. The superclass would sieze the lock in an :around method that would run before any of the subclass primary methods ran.

Ordinary chaining of methods doesn’t need the :around qualifier. Just chain the methods.

Tuesday, March 26, 2024

With- vs. call-with-

In Common Lisp, there are a lot of macros that begin with the word “with-”. These typically wrap a body of code, and establish a context around the execution of the code.

In Scheme, they instead have a lot of functions that begin with the words “call-with-”. They typically take a thunk or receiver as an argument, and establish a context around a call to the thunk or receiver.

Both of these forms accomplish the same sort of thing: running some user supplied code within a context. The Scheme way accomplishes this without a macro, but “call-with-” functions are rarely used as arguments to higher order functions. Writing one as a function is slightly easier than writing one as a macro because the compiler takes care of avoiding variable capture. Writing one as a macro leaves it up to the implementor to use appropriate gensyms. Writing one as a macro avoids a closure and a function call, but so does inlining the function. The macro form is slightly more concise because it doesn’t have a lambda at every call site. The function form will likely be easier to debug because it will probably involve a frame on the stack.

There’s no need to commit to either. Just write a “with-” macro that expands into a call to an inline “call-with-” function. This should equally please and irritate everyone.

Thursday, March 21, 2024

Porting a Game from Java (update)

I didn’t expect anyone would be interested, so I just pushed the code that I had with little thought about anyone trying to use it. It turns out that some people actually wanted to run it, so I polished off some of the rough edges and made it easier to get working. Feel free to email me if you have questions or suggestions.

Tuesday, March 19, 2024

Porting a Game from Java

I decided to learn about games, so I followed along with a tutorial by Kaarin Gaming. His tutorial was written in Java, but of course I used Common Lisp. I made no attempt to faithfully replicate his design, but I followed it closely in some places, less so in others. The resulting program was more or less a port of the Java program to Common Lisp, so it not very remarkable in and of itself. Certainly I don’t expect many people to be interested in reading beyond this point.

It’s known that Java is wordy language and it shows in the end result. The tutorial had 3712 lines of Java code in 39 files and the equivalent Common Lisp was 2255 lines in 21 files. A typical Common Lisp file would contain more code than a typical Java file. It was often the case that a Common Lisp file would contain multiple classes.

Both versions used separate render and game mechanics threads. The render thread ran at about 60 frames per second where the game mechanics ran at 200 steps per second. The threads were mostly independent, but the game mechanics would occasionally have to query the render thread to find out whether an animation had completed or what frame the animation was on in order to synchronize attacks with reactions.

There were a couple of notable differences in the two implementations. The Java implementation would advance animation frames imperatively by incrementing the animation frame counter every few rendering cycles. The Common Lisp implementation would instead compute the animation frame functionally by subtracting the current time from the animation start time and dividing by the ticks per animation frame. In Common Lisp, different animation effects could be achieved by changing how the animation frame was computed. If you computed the frame number modulo the number of frames in the animation, you’d get an animation loop. If you clamped the frame number, you get a one-shot animation.

CLOS made some things easier. An :after method on the (setf get-state) of an entity would set the animation. The get-y method on some objects would (call-next-method) to get the actual y position and then add a time varying offset to make the object “float” in mid air. The get-x method on projectiles would would (call-next-method) to get the starting x position and then add a factor of the current ticks. This causes projectiles to travel uniformly horizontally across the screen. I often used the ability of CLOS to specialize on some argument other than the first one just to make the methods more readable.

The Common Lisp code is at http://github.com/jrm-code-project/Platformer, while the Java code is at https://github.com/KaarinGaming/PlatformerTutorial. Being a simple port of the Java code, the Common Lisp code is not exemplary, and since I didn’t know what I was doing, it is kludgy in places. The Common Lisp code would be improved by a rewrite, but I’m not going to. I was unable to find a sound library for Common Lisp that could play .wav files without a noticable delay, so adding sound effects to the game is sort of a non-starter. I think I’ve gotten what I can out of this exercise, so I’ll likely abandon it now.

Saturday, February 10, 2024

Bilinear Fractional Transformations

A bilinear fractional transformation (BiLFT) is a two-argument function of the form

(lambda (x y)
  (/ (+ (* A x y) (* B x) (* C y) D)
     (+ (* E x y) (* F x) (* G y) H)))
William Gosper figured out how to use BiLFTs to perform linear fractional operations on infinite compositions of LFTs. It isn’t as hard as it might seem.

BiLFTs do not form a group under functional composition with LFTs, but they have this interesting property: if you hold either input to the BiLFT constant, the BiLFT becomes a LFT on the other input. So if we consider just one of the inputs at a time (or the output), we can apply some group-like operations.

You can compose LFTs with BiLFTs in three ways. First, you can run the output of a BiLFT into the input of a LFT. Second and third, you can run the output of a LFT into the either the x or y input of a BiLFT. Composing a LFT with a BiLFT in any of these ways produces another BiLFT, so there is some notion of closure under composition. Composing the identity LFT with a BiLFT in any of these ways does not change anything. For any composition of a LFT with a BiLFT, you can compose the inverse of the LFT to undo the composition.

BiLFTs can also form infinite compositions, but since there are two inputs, each input can be infinite. We cannot use a Scheme stream, but we can create a two-tailed stream-like object called a binary-expression. A binary-expression is a composition of two infinite compositions. We represent a binary-expression as an object with two delayed cdrs.

(defclass binary-expression ()
  ((generation :initarg :generation
               :initform 0)
   (bilft :initarg :bilft
          :initform (error "Required initarg :bilft was omitted."))
   (delayed-left :initarg :delayed-left
                 :initform (error "Required initarg :delayed-left was omitted."))
   (delayed-right :initarg :delayed-right
                  :initform (error "Required initarg :delayed-right was omitted."))))

Like a LFT, we use binary-expressions by trying to operate on the BiLFT, but forcing one of the tails and refining the binary-expression when necessary.

(defun refine-left (binary-expression)
  (let ((delayed-left (slot-value binary-expression 'delayed-left)))
    (if (null delayed-left)
        (error "Attempt to refine-left on empty stream.")
        (let ((left (force delayed-left)))
          (make-instance 'binary-expression
                         :generation (+ (slot-value binary-expression 'generation) 1)
                         :bilft (compose-bilft-lft-x
                                 (slot-value binary-expression 'bilft)
                                 (if (empty-stream? left)
                                     (make-lft 1 1
                                               0 0)
                                     (stream-car left)))
                         :delayed-left (if (empty-stream? left)
                                           nil
                                           (stream-delayed-cdr left))
                         :delayed-right (slot-value binary-expression 'delayed-right))))))

(defun refine-right (binary-expression)
  (let ((delayed-right (slot-value binary-expression 'delayed-right)))
    (if (null delayed-right)
        (error "Attempt to refine-right on empty stream.")
        (let ((right (force delayed-right)))
          (make-instance 'binary-expression
                         :generation (+ (slot-value binary-expression 'generation) 1)
                         :bilft (compose-bilft-lft-y
                                 (slot-value binary-expression 'bilft)
                                 (if (empty-stream? right)
                                     (make-lft 1 1
                                               0 0)
                                     (stream-car right)))
                         :delayed-left (slot-value binary-expression 'delayed-left)
                         :delayed-right (if (empty-stream? right)
                                            nil
                                            (stream-delayed-cdr right)))))))

refine-fairly alternates forcing the delayed-left or delayed-right tail while refine-disjoint looks at the overlap in the ranges of the BiLFT.


(defun refine-fairly (binary-expression)
  (if (zerop (mod (slot-value binary-expression 'generation) 2))
      (if (null (slot-value binary-expression 'delayed-left))
          (refine-right binary-expression)
          (refine-left binary-expression))
      (if (null (slot-value binary-expression 'delayed-right))
          (refine-left binary-expression)
          (refine-right binary-expression))))

(defun refine-disjoint (binary-expression)
  (if (bilft-disjoint? (slot-value binary-expression 'bilft))
      (refine-right binary-expression)
      (refine-left binary-expression)))

You can do arithmetic on infinite compositions by using the appropriate BiLFT:

(defparameter bilft-add 
  (make-bilft 0 1 1 0
              0 0 0 1))

(defparameter bilft-subtract 
  (make-bilft 0 1 -1 0
              0 0 0 1))

(defparameter bilft-multiply 
  (make-bilft 1 0 0 0
              0 0 0 1))

(defparameter bilft-divide 
  (make-bilft 0 1 0 0
              0 0 1 0))

Converting a binary-expression to a LFT stream

binary-expressions can be operated on in an analagous way to an infinite composition of LFTs, but in order to nest binary expressions, we need to be able to turn one into an infinite composition of LFTs so it can become the left or right input of the other. The way to do this is to generate a stream of LFTs and their inverses. The inverses are composed into the output of the binary-expression while the LFTs are composed downstream into one of the inputs of the next binary-expression.

The math works such that we can select any LFT that has an inverse, but we want our binary-expression to represent a narrowing transformation, so we try composing a few different LFTs and their inverses and see if we still have a narrowing transformation. If we find one, we proceed with the computation, but if we cannot find a LFT and its inverse that preserves the narrowing, we refine the binary-expression by forcing one of the two delayed inputs and composing it.

(defun decompose (left composed)
  (values left (funcall (inverse-lft left) composed)))

(defun decompose-range? (left composed if-success if-failure)
  (multiple-value-bind (left right) (decompose left composed)
    (if (range? right)  ;; does it still narrow?
        (funcall if-success left right)
        (funcall if-failure))))

(defun try-decompose-digit (lft if-success if-failure)
  (decompose-range?
   (make-lft 2 1 0 1) lft
   if-success
   (lambda ()
     (decompose-range?
      (make-lft 1 0 1 2) lft
      if-success
      (lambda ()
        (decompose-range?
         (make-lft 3 1 1 3) lft
         if-success
         if-failure))))))
    
(defun binary-expression-decompose-digit (binary-expression)
  (try-decompose-digit
   (slot-value binary-expression 'bilft)
   (lambda (digit bilft)
     (values digit (make-instance 'binary-expression
                                  :generation (slot-value binary-expression 'generation)
                                  :bilft bilft
                                  :delayed-left (slot-value binary-expression 'delayed-left)
                                  :delayed-right (slot-value binary-expression 'delayed-right))))
   (lambda ()
     (binary-expression-decompose-digit (refine-disjoint binary-expression)))))

(defun binary-expression->lft-stream (binary-expression)
  (multiple-value-bind (digit remainder) (binary-expression-decompose-digit binary-expression)
    (cons-lft-stream digit 
                     (binary-expression->lft-stream remainder))))

Now we have the machinery we need to do arithmetic on pairs of LFT streams.

Infinite Expression Trees

You can only go so far with a finite number of arithmetic operations, but you can go much further with an infinite number. For example, you can create converging infinite series. We can recursively generate an infinite tree of binary expressions. We unfold the tree and call a generating function at each recursive level.

(defun unfold-expression-tree-1 (left generate counter)
  (funcall (funcall generate counter)
           left
           (delay-lft-stream (unfold-expression-tree-1 left generate (+ counter 1)))))
Often the root of the tree is special, so the usual way we call this is
(defun unfold-expression-tree (root-bilft left generate)
  (funcall root-bilft
           left
           (delay-lft-stream (unfold-expression-tree-1 left generate 1))))

Square Root of Infinite Composition

To compute the square root of an infinite composition, we create an infinite tree. Each level of the tree refines the estimate of the square root of the estimate from the next level down in the tree.

(defun sqrt-lft-stream (lft-stream)
  (unfold-expression-tree-1
    lft-stream
    (constantly (make-bilft 1 2 1 0
                            0 1 2 1))
    0))
but we need not construct an infinite tree if each level is identical. We just re-use the level:
(defun sqrt-lft-stream (lft-stream)
  (let ((stream nil))
    (setq stream
          (funcall (make-bilft 1 2 1 0
                               0 1 2 1)
                   lft-stream
                   (delay-lft-stream stream)))
    stream))
This feeds back the square root into its own computation. (These ideas are from Peter Potts and Reinhold Heckmann.)

ex and log(x)

Peter Potts gives these formulas for ex and log(x) where x is an infinite composition of LFTs:

(defun %exp-lft-stream (lft-stream)
  (unfold-expression-tree-1
   (lambda (n)
       (make-bilft (+ (* 2 n) 2) (+ (* 2 n) 1) (* 2 n) (+ (* 2 n) 1)
                   (+ (* 2 n) 1) (* 2 n) (+ (* 2 n) 1) (+ (* 2 n) 2)))
   0
   (funcall (inverse-lft (make-lft 1 -1 1 1)) lft-stream)))

(defun %log-lft-stream (lft-stream)
  (unfold-expression-tree
   (make-bilft 1 1 -1 -1
               0 1  1  0)
   (lambda (n)
     (make-bilft n (+ (* n 2) 1)       (+ n 1) 0
                 0       (+ n 1) (+ (* n 2) 1) n))
   lft-stream))
These infinite expression trees converge only on a limited range, so we need to use identities on the arguments and results to extend the range to cover a wide set of inputs.

Sunday, February 4, 2024

Infinite Composition

If you like functional composition, you’ll like linear fractional transformations (LFTs). They behave very nicely when you compose them (in fact, they are a group under functional composition). If you compose two LFTs, you get another LFT. You can keep composing as long as you wish. Let’s compose an infinite number of LFTs.

I suppose I should argue that composing an infinite number of LFTs will give you anything at all. Consider those LFTs with non-negative coefficients. When given an input in the range [0,∞], they will output a number in the range [A/C, B/D]. The output range is narrower than, and contained within, the input range. If we compose two such LFTs, the range of output of the outermost LFT will be even narrower. As we compose more and more of these narrowing LFTs, the range of output of the outermost LFT will become narrower and narrower. If we compose an infinite number of narrowing LFTs, the output range will narrow to a single point. Curiously, the limits on the range of output of a finite composition of LFTs are always rational numbers, yet the output from infinite composition of LFTs can be an irrational real number.

There are many ways to represent an infinite number of LFTs. A generator or a co-routine, for example, but I like the stream abstraction from Scheme. This is easily implemented in Common Lisp. A delay macro creates promises:

(defclass promise ()
  ((forced? :initform nil)
   (values-or-thunk :initarg :thunk
                    :initform (error "Required initarg :thunk omitted.")))
  (:documentation "A simple call-by-need thunk."))

(defmacro delay (expression)
  “Delays evaluation of an expression and returns a promise.”
  ‘(make-instance ’promise :thunk (lambda () ,expression)))
and the force function forces evaluation:
(defun force (promise)
  “Returns the values of a promise, forcing it if necessary.”
  (check-type promise promise)
  ;; Ensure the values have been memoized.
  (unless (slot-value promise ’forced?)
    (let ((values (multiple-value-list (funcall (slot-value promise ’values-or-thunk)))))
      ;; If this is not a recursive call, memoize the result.
      ;; If this is a recursive call, the result is discarded.
      (unless (slot-value promise ’forced?)
        (setf (slot-value promise ’values-or-thunk) values)
        (setf (slot-value promise ’forced?) t))))
   ;; Return the memoized values.
   (values-list (slot-value promise ’values-or-thunk)))
A stream is a cons of an element and a promise to produce the rest of the stream:
(defclass lft-stream ()
  ((car :initarg :car
        :initform (error "Required initarg :car omitted.")
        :reader stream-car)
   (delayed-cdr :initarg :delayed-cdr
                :initform (error "Required initarg :delayed-cdr omitted.")
                :reader stream-delayed-cdr)))

(defmacro cons-lft-stream (car cdr)
  ‘(make-instance ’stream :car ,car :delayed-cdr (delay ,cdr)))

(defun stream-cdr (stream)
  (force (stream-delayed-cdr stream)))

A stream of LFTs will represent an infinite composition. The first LFT is the outermost LFT in the composition. To incrementally compose the LFTs, we force the delayed cdr of the stream to get the second LFT. We compose the first LFT and the second LFT to get a new stream car, and the cdr of the delayed cdr is the new stream cdr. That is, we start with the stream {F …}, we force the tail, getting {F G …}, then we compose the first and second elements, getting {(F∘G) …}.

(defun refine-lft-stream (lft-stream)
  (let ((tail (stream-cdr lft-stream))) ;; forces cdr
    (make-instance ’stream
                   :car (compose (stream-car lft-stream) (stream-car tail))
                   :delayed-cdr (stream-delayed-cdr tail))))
so we can now write code that operates on what we have composed so far (in the car of the LFT stream), or, if we need to incrementally compose more LFTs, we repeatedly call refine-lft-stream until we have composed enough.

For example, we can compute the nearest float to an infinite composition as follows:

(defun nearest-single (lft-stream)
  (let ((f0 (funcall (stream-car lft-stream) 0))
        (finf (funcall (stream-car lft-stream) ’infinity)))
    (if (and (numberp f0)
             (numberp finf)
             (= (coerce f0 ’single-float)
                (coerce finf ’single-float)))
        (coerce f0 ’single-float)
        (nearest-single (refine-lft-stream lft-stream)))))

The problem with infinite compositions is that they may never narrow enough to proceed with a computation. For example, suppose an infinite composition converged to a number exactly in between two adjacent floating point numbers. The upper limit of the narrowing range will always round to the float that is above, while the lower limit will always round to the float that is below. The floats will never be equal no matter how many terms of the infinite composition are composed, so functions like nearest-single will never return a value. This is a tradeoff. We either continue computing, perhaps forever, with more and more precise values, or we stop at some point, perhaps giving an incorrect answer.

Operating on Infinite Compositions

You can compose a LFT with an infinite composition of LFTs. If we compose the LFT F with the infinite composition {G H …}, we can represent this one of two ways. Either we just tack the F on to the front, getting {F G H …}, or we continue further and eagerly compose the first two elements {(F∘G) H …}. If F is, for example, a LFT that adds 10 or multiplies by 7, the effect is to add 10 to or multiply by 7 the infinite composition. In this way, we can do arithmetic involving rational numbers and infinite compositions.

Sources of Infinite Compositions

It isn’t likely that you have a lot of ad hoc LFT streams you want to compose. Instead, we want some sources of infinite compositions. There are a few useful functions of finite arguments that return infinite compositions. I got these from Peter Potts's thesis. While most of these have limited ranges of arguments, you can use identity operations to divide down the input to an acceptable range and multiply up the output to the correct answer.

√(p/q)

Reinhold Heckmann came up with this one:

(defun %sqrt-rat (p q)
  (let ((diff (- p q)))
    (labels ((rollover (num den)
               (let ((d (+ (* 2 (- den num)) diff)))
                 (if (> d 0)
                     (cons-lft-stream (make-lft 1 0 1 2) (rollover (* num 4) d))
                     (cons-lft-stream (make-lft 2 1 0 1) (rollover (- d) (* den 4)))))))
       (rollover p q))))

The LFTs that this returns are curious in that when you compose them with a range, they divide the range in two and select one or other (high or low) segment. The outermost LFT in the infinite composition will represent a range that contains the square root, and the subsequent LFTs will narrow it down by repeatedly bifurcating it and selecting the top or bottom segment.

ex, where x is rational and 1/2 < x ≤ 2

(defun %exp-rat (x)
  (check-type x (rational (1/2) 2))
  (cons-lft-stream
    (make-lft (+ 2 x) x
              (- 2 x) x)
    (lft-stream-map (lambda (n)
                      (make-lft (+ (* 4 n) 2) x
                                x             0))
                    (naturals))))

log(x), where x is rational and x > 1

(defun %log-rat (x)
  (check-type x (rational (1) *))
  (lft-stream-map
    (lambda (n)
      (funcall (make-lft 0             (- x 1)
                         (- x 1) (+ (* n 2) 1))
               (make-lft 0       (+ n 1)
                         (+ n 1)       2)))
    (integers)))

xy, where x and y are rational and x > 1 and 0 < y < 1

(defun %rat-pow (x y)
  (check-type x (rational (1) *))
  (check-type y (rational (0) (1)))
  (cons-lft-stream
    (make-lft y 1
              0 1)
    (lft-stream-map
     (lambda (n)
       (funcall (make-lft 0             (- x 1)
                          (- x 1) (- (* n 2) 1))
                (make-lft 0       (- n y)
                          (+ n y)       2)))
     (naturals))))

tan(x), where x is rational and 0 < x ≤ 1

(defun %rat-tan (x)
  (check-type x (rational (0) 1))
  (lft-stream-map
   (lambda (n)
     (funcall (make-lft 0 x
                        x (+ (* n 4) 1))
              (make-lft 0 x
                        x (- (* n -4) 3))))
   (integers)))

tan-1(x), where x is rational and 0 < x ≤ 1

(defun big-k-stream (numerators denominators)
  (cons-lft-stream (make-lft 0 (stream-car numerators)
                             1 (stream-car denominators))
                   (big-k-stream (stream-cdr numerators) (stream-cdr denominators))))

(defun %rat-atan (z)
  (check-type z (rational (0) 1))
  (let ((z-squared (square z)))
    (cons-lft-stream (make-lft 0 z
                               1 1)
                     (big-k-stream (stream-map (lambda (square)
                                                 (* z-squared square))
                                               (squares))
                                   (stream-cdr (odds))))))

These functions have limited applicability. In my next couple of posts, I’ll show some functions that transform LFT streams into other LFT streams, which can be used to increase the range of inputs and outputs of these primitive sources.

Sunday, January 28, 2024

Exponentiating Functions

If we compose a function F(x) with itself, we get F(F(x)) or F∘F. If we compose it again, we get F(F(F(x))) or F∘F∘F. Rather than writing ‘F’ over and over again, we can abuse exponential notation and write (F∘F∘F) as F3, where the superscript indicates how many times we compose the function. F1 is, naturally, just F. F0 would be zero applications of F, which would be the function that leaves its argument unchanged, i.e. the identity function.

The analogy with exponentiation goes deeper. We can quickly exponentiate a number with a divide and conquer algorithm:

(defun my-expt (base exponent)
  (cond ((zerop exponent) 1)
        ((evenp exponent) (my-expt (* base base) (/ exponent 2)))
        (t (* base (my-expt base (- exponent 1))))))
The analagous algorithm will exponentiate a function:
(defun fexpt (f exponent)
  (cond ((zerop exponent) #'identity)
        ((evenp exponent) (fexpt (compose f f) (/ exponent 2)))
        (t (compose f (fexpt f (- exponent 1))))))

The function (lambda (x) (+ 1 (/ 1 x))) takes the reciprocal of its input and adds one to it. What happens if you compose it with itself? We can rewrite it as a linear fractional transformation (LFT) (make-lft 1 1 1 0) and try it out:

> (make-lft 1 1 1 0)
#<LFT 1 + 1/x>

> (compose * (make-lft 1 1 1 0))
#<LFT (2x + 1)/(x + 1)>

> (compose * (make-lft 1 1 1 0))
#<LFT (3x + 2)/(2x + 1)>

> (compose * (make-lft 1 1 1 0))
#<LFT (5x + 3)/(3x + 2)>

> (compose * (make-lft 1 1 1 0))
#<LFT (8x + 5)/(5x + 3)>

> (compose * (make-lft 1 1 1 0))
#<LFT (13x + 8)/(8x + 5)>

> (compose * (make-lft 1 1 1 0))
#<LFT (21x + 13)/(13x + 8)>
Notice how the coefficients are Fibonacci numbers.

We can compute Fibonacci numbers efficiently by exponentiating the LFT 1 + 1/x.

> (fexpt (make-lft 1 1 1 0) 10)
#<LFT (89x + 55)/(55x + 34)>

> (fexpt (make-lft 1 1 1 0) 32)
#<LFT (3524578x + 2178309)/(2178309x + 1346269)>
Since we’re using a divide and conquer algorithm, raising to the 32nd power involves only five matrix multiplies.

Fixed points

If an input x maps to itself under function f, we say that x is a fixed point of f. So suppose we have a function f with a fixed point x. We consider the function f’ which ignores its argument and outputs x. If we compose f with f’, it won’t make difference that we run the result through f again, so f’ = f∘f’ = f. You can find a fixed point of a function by composing the function with its fixed point. Unfortunately, that only works in a lazy language, so you have two options: either choose a finite number of compositions up front, or compose on demand.

You can approximate the fixed point of a function by exponentiating the function to a large number.

(defun approx-fixed-point (f) 
  (funcall (fexpt f 100) 1))

> (float (approx-fixed-point (make-lft 1 1 1 0)))
1.618034

> (float (funcall (make-lft 1 1 1 0) *))
1.618034

Alternatively, we could incrementally compose f with itself as needed. To tell if we are done, we need to determine if we have reached a function that ignores its input and outputs the fixed point. If the function is a LFT, we need only check that the limits of the LFT are equal (up to the desired precision).

(defun fixed-point (lft)
  (let ((f0   (funcall lft 0))
        (finf (funcall lft ’infinity)))  
    (if (and (numberp f0)
             (numberp finf)
             (= (coerce f0 ’single-float) 
                (coerce finf ’single-float)))
        (coerce f0 ’single-float)
        (fixed-point (compose lft lft)))))

> (fixed-point (make-lft 1 1 1 0))
1.618034