Tuesday, March 25, 2025

Vibe Coding in Common Lisp, continued

I unwedged the AI with regard to the package system, so I asked the AI to write the game loop.

Now there are a number of ways to approach a game loop, but there is one strategy that really wins: a nested set of with-... macros. These win because they tie resource management to the dynamic execution scope. You write with-window, and upon entry to the form, a window is allocated and initialized and comes into scope during the body of the code. When the body returns, the window is destroyed and deallocated.

These with-... macros are built around allocation/deallocation pairs of primitives that are called from an unwind protect. The abstraction is the inverse of a function call: instead using a function to hide a body of code, you use a function to wrap a body of code. The body of code doesn’t hide in the callee, but is passed as an argument from the caller. One important feature of programming in this way is that resources are never returned from a function, but are only passed downwards from the allocation point. This keeps objects from escaping their dynamic scope.

The entry to the game loop consists of a few nested with-... macros that initialize the library, allocate a window, allocate a drawing context, and enter a event loop. When the event loop exits, the resources are torn down in reverse order of allocation leaving the system in a clean state.

But the AI did not use with-... macros. The code it generated had subroutines that would create a window or allocate a drawing context, but it would assign the created objects into a global variable. This means that the object is set to escape its dynamic scope when it is created. There is nothing to prevent (or even discourage) access to the object after it has been deallocated. There were no unwind-protects anywhere, so objects, once allocated, were eternal — you could never close a window.

In the large, the code was built to fail. In the small, it immediately failed. Calling conventions were not even followed. Keyword agument functions were called with positional arguments, or with an odd number of arguments, irrelevant extra arguments were passed in, the AI would pass in flags that didn’t exist. We’ll grant that the AI does not ultimately understand what it is doing, but it should at least make the argument lists superficially line up. That doesn’t require AI, a simple pattern match can detect this.

The event loop did not load, let alone compile. It referred to symbols that did not exist. We’ll we can expect this, but it needs to be corrected. When I pointed out that the symbol didn’t exist, the AI began to thrash. It would try the same symbol, but with asterisks around it. It would try a variant of the same symbol. Then it would go back and try the original symbol again, try the asterisks again, try the same variant name again, etc. There is nothing to be done here but manual intervention.

There are some macros that set up an event loop, poll for an event, disptach to some code for that event while extracting the event particulars. You can roll your own event loop, or you can just use one of pre-built macros. When the AI began to thrash on the event loop, I intervened, deleted the code it was thrashing on and put in the event loop macro. The AI immediately put back in the code I had removed and started thrashing on it again.

Again, it is clear that the AI has no knowledge at all of what it is doing. It doesn’t understand syntax or the simplest of semantics. It cannot even tell if a symbol is bound to a value. Even the most junior developer won’t just make up functions that are not in the library. The AI doesn’t consult the documentation to validate if the generated code even remotely passes a sniff test.

You cannot “vibe code” Common Lisp. The AI begins to thrash and you simply must step in to get it unwedged. It doesn’t converge to any solution whatsoever. I suspect that this is because there is simply not enough training data. Common Lisp would appear to need some semantic understanding in order to write plausibly working code. Just mimicking some syntax you found on the web (which is ultimately what the AI is doing) will not get you very far at all.

1 comment:

fadrian said...

You are correct as to the cause of the problem - there simply isn't enough CL code out there to drive an answer through the LLM. I've had the same issue with Clojure. It's good for finding micro-functions that could have been found through web searching, but anything larger than that requires actual understanding - something that LLMs are notoriously bad at.

I think the larger worry is that if vibe coding becomes a thing, this will add even more emphasis to using top-ten languages, to the detriment of lesser-used, but more powerful, languages.

But even at it's best, vibe coding requires the prompter to carefully guide the AI to come up with code that is architecturally sound. This has been true when I've used it to generate Python code and I assume it would be true even if some other highly-used language (JavaScript, Java, SQL, etc.) were being generated.

So anyway, given its level of "expertise", which I would describe as "entry-level programmer with access to Stack Overflow", I'd like to say that this method will be little more than a fad. But sadly, much code is written by people who have this level of expertise, so I'm afraid that it's here to stay. Again, this has the probable outcome of making lesser-used languages even more lesser-used and more badly-written code being out there, leading to the detriment of the languages I love and of the industry as a whole. The only good news is once the mess is big enough, there will be good consultancy money in cleaning it up.