Wednesday, May 28, 2025

Vibe Coding Common Lisp Through the Back Door

I had no luck vibe coding Common Lisp, but I'm pretty sure I know the reasons. First, Common Lisp doesn't have as much boilerplate as other languages. When boilerplate accumulates, you write a macro to make it go away. Second, Common Lisp is not as popular a language as others, so there is far less training data.

Someone made the interesting suggestion of doing this in two steps: vibe code in a popular language, then ask the LLM to translate the result into Common Lisp. That sounded like it might work, so I decided to try it out.

Again, I used "Minesweeper" as the example. I asked the LLM to vibe code Minesweeper in Golang. Golang has a lot of boilerplate (it seems to be mostly boilerplate), and there is a good body of code written in Golang.

The first problem was that the code expected image assets for the Minesweeper tiles. I asked the LLM to generate them, but it wasn't keen on doing that. It would generate a large JPEG image of a field of tiles, but not a set of .png images of the individual tiles.

So I asked the LLM to vibe code a program that would generate the .png files for the tiles. It took a couple of iterations (the first time, the digits in the tiles were too small to read), but it eventually generated a program which would generate the tiles.

Then I vibe coded minesweeper. As per the philosophy of vibe coding, I did not bother writing tests, examining the code, or anything. I just ran the code.

Naturally it didn't work. It took me the entire day to debug this, but there were only two problems. The first was that the LLM simply could not get the API to the image library right. It kept thinking the image library was going to return an integer error code, but the latest API returns a value of Go's error interface. I could not get it to use this correctly; it kept trying to coerce the error to an integer. Eventually I simply discarded any error from that library and prayed it would work.
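For reference, here is a minimal sketch of the idiom the LLM kept missing. It is not the code from my game: I am using the standard library's image/png package as a stand-in for the image library, and decodeTile is a hypothetical helper. The point is only that a Go function reports failure through a value of the error interface, which you compare against nil and inspect with errors.As, never against an integer.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"image"
	"image/png"
)

// decodeTile (hypothetical helper) decodes PNG bytes. Failure is reported
// through the error interface: nil means success, and there is no integer
// code to compare against.
func decodeTile(data []byte) (image.Image, error) {
	img, err := png.Decode(bytes.NewReader(data))
	if err != nil {
		// Wrap with context; %w keeps the original error inspectable.
		return nil, fmt.Errorf("decode tile: %w", err)
	}
	return img, nil
}

func main() {
	_, err := decodeTile([]byte("this is not a PNG"))
	if err != nil {
		// errors.As looks through the wrapping for a specific error type.
		var formatErr png.FormatError
		if errors.As(err, &formatErr) {
			fmt.Println("bad PNG data:", formatErr)
		} else {
			fmt.Println("other failure:", err)
		}
	}
}
```

Discarding the error, as I ended up doing, amounts to writing `img, _ := ...` and hoping.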

The second problem was vexing. I was presented with a blank screen. The game logic seemed to work because when I clicked around on the blank screen, stdout would eventually print "Boom!". But there were no visuals. I spent a lot of time trying to figure out what was going on, adding debugging code, and so on. I finally discovered that the SDL renderer was simply not working. It wouldn't render anything. I asked the LLM to help me debug this, and I went down a rabbit hole of updating the graphics drivers, reinstalling SDL, reinstalling Ubuntu, all to no avail. Eventually I tried using the SDL2 software renderer instead of the hardware accelerated renderer and suddenly I had graphics. It took me several hours to figure this out, and several hours to back out my changes tracking down this problem.

Once I got the tiles to render, though, it was a working Minesweeper game. It didn't have a timer and mine count, but it had a playing field and you could click on the tiles to reveal them. It had the look and feel of the real game. So you can vibe code Golang.

The next task was to translate the Golang to Common Lisp. It didn't do as good a job. It mentioned symbols that didn't exist in packages that didn't exist. I had to make a manual pass to replace the bogus symbols with the nearest real ones. It failed to generate working code that could load the tiles. I looked at the Common Lisp code and it was a horror. Not surprisingly, it was more or less a transliteration of the Golang code. It took no advantage of any Common Lisp features such as unwind-protect. Basically, each and every branch in the main function had its own duplicate copy of the cleanup code. Since the tiles were not loading, I couldn't really test the game logic. I was in no mood to debug the tile loading (it was trying to call functions that did not exist), so I left it there.
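To illustrate the cleanup point, here is a small sketch of the "write the cleanup once" shape. It is in Go rather than Common Lisp, using defer, which is the closest Go analogue to unwind-protect that I know of. The resource type and the acquire, release, and run functions are hypothetical stand-ins for things like a window and a renderer, not code from either version of the game.

```go
package main

import (
	"errors"
	"fmt"
)

// resource stands in for something like a window or renderer (hypothetical).
type resource struct {
	name string
	log  *[]string
}

// acquire records that the resource was opened and returns it.
func acquire(name string, log *[]string) *resource {
	*log = append(*log, "open "+name)
	return &resource{name: name, log: log}
}

// release records that the resource was closed.
func (r *resource) release() {
	*r.log = append(*r.log, "close "+r.name)
}

// run has several exits, but the cleanup for each resource is written
// exactly once, right next to the acquisition.
func run(failAt string, log *[]string) error {
	window := acquire("window", log)
	defer window.release()

	if failAt == "renderer" {
		return errors.New("could not create renderer")
	}
	renderer := acquire("renderer", log)
	defer renderer.release()

	if failAt == "tiles" {
		return errors.New("could not load tiles")
	}
	return nil
}

func main() {
	var log []string
	err := run("tiles", &log)
	fmt.Println(err, log)
}
```

Deferred calls run in reverse order on every exit path, so the renderer is released before the window no matter which branch returns. In Common Lisp the same guarantee comes from wrapping the body in unwind-protect with the cleanup forms after it.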

This approach, vibe in Golang and then translate to Common Lisp, seems more promising, but with two phases of LLM coding, the probability of a working result gets pretty low. And you don't really get Common Lisp, you get something closer to fully parenthesized Golang.

I think I am done with this experiment for now. When I have some agentic LLM that can drive Emacs, I may try it again.

Tuesday, May 27, 2025

Thoughts on LLMs

I've been exploring LLMs these past few weeks. There is a lot of hype and confusion about them, so I thought I'd share some of my thoughts.

LLMs are a significant step forward in machine learning. They have their limitations (you cannot Vibe Code in Common Lisp), but on the jobs they are good at, they are uncanny in their ability. They are not a replacement for anything, actually, but a new tool that can be used in many diverse and some unexpected ways.

It is clear to me that LLMs are the biggest thing to happen in computing since the web, and I don't say that lightly. I am encouraging everyone I know to learn about them, gain some experience with them, and learn their uses and limitations. Not knowing how to use an LLM will be like not knowing how to use a web browser.

Our company has invested in GitHub Copilot, which is nominally aimed at helping programmers write code, but if you poke around at it a bit, you can get access to the general LLM that underlies the code generation. Our company enables the GPT-4.1, Claude Sonnet 3.5, and Claude Sonnet 3.7 models. There is some rather complicated and confusing pricing for these models, and our general policy is to push people to use GPT-4.1 for the bulk of their work and to use Claude Sonnet models on an as-needed basis.

For my personal use, I decided to try a subscription to Google Gemini. The offering is confusing and I am not completely sure which services I get at which ancillary cost. It appears that there is a baseline model that costs nothing beyond the subscription fee, but there are also more advanced models that have a per-query cost above and beyond my subscription. It does not help that there is a "1 month free trial", so I am not sure what I will eventually be billed for. (I am keeping an eye on the billing!) A subscription at around $20 per month seems reasonable to me, but I don't want to run up a few hundred queries at $1.50 each.

The Gemini offering appears to allow an unlimited (or at least very large) number of queries if you use the web UI, but the API incurs a cost per query. That is unfortunate because I want to use Gemini in Emacs.

Speaking of Emacs, there are a couple of bleeding edge Emacs packages that provide access to LLMs. I have hooked up code completion to Copilot, so I get completion suggestions as I type. I think I get these at zero extra cost. I also have Copilot Chat and Gemini Chat set up so I can converse with the LLM in a buffer. The Copilot chat uses an Org mode buffer whereas the Gemini chat uses a markdown buffer. I am trying to figure out how to hook up Emacs as an LLM agent so that the LLM can drive Emacs to accomplish tasks. (Obviously it is completely insane to do this, I don't trust it a bit, but I want to see the limitations.)

The Gemini offering also gives you access to Gemini integration with other Google tools, such as Gmail and Google Docs. These look promising. There is also AI Studio in addition to the plain Gemini web UI. AI Studio is a more advanced interface that allows you to generate applications and media with the LLM.

I am curious about Groq. They are at a bit of a disadvantage in that they cannot integrate with Google Apps or Microsoft Apps as well as Gemini and Copilot can. I may try them out at some time in the future.

I encourage everyone to try out LLMs and gain some experience with them. I think we're on the exponential growth part of the curve at this point and we will see some amazing things in the next few years. I would not wait too long to get started. It will be a significant skill to develop.

Sunday, May 25, 2025

Roll Your Own Bullshit

Many people with pointless psychology degrees make money by creating corporate training courses. But you don't need a fancy degree to write your own training course. They all follow the same basic format:

  1. Pick any two axes of the Myers-Briggs personality test.
  2. Ask the participants to answer a few questions designed to determine where they fall on those axes. It is unimportant what the answers are, only that there is a distribution of answers.
  3. Since you have chosen two axes, the answers will, by necessity, fall into four quadrants. (Had you chosen three axes, you'd have eight octants, which is too messy to visualize.)
  4. Fudge the median scores so that the quadrants are roughly equal in size. If you have a lot of participants, you can use statistical methods to ensure that the quadrants are equal, but for small groups, just eyeball it.
  5. Give each quadrant a name, like “The Thinkers”, “The Feelers”, “The Doers”, and “The Dreamers”. It doesn't matter what you call them, as long as they sound good.
  6. Assign to each quadrant a set of traits that are supposed to be broad stereotypes of people in that quadrant. Again, it doesn't matter what you say, as long as it sounds good. Pick at least two positive and two negative traits for each quadrant.
  7. Assign each participant to a quadrant based on their answers.
  8. Have the participants break into focus groups for their quadrants and discuss among themselves how their quadrant relates to the other quadrants.
  9. Break for stale sandwiches and bad coffee.
  10. Have each group report back to the larger group, where they restate what they discussed in their focus groups.
  11. Conclude with a summary of the traits of each quadrant, and how they relate to each other. This is the most important part, because it is where you can make up any bullshit you want, and nobody will be able to call you on it. Try to sound as if you have made some profound insights.
  12. Have the participants fill out a survey to see how they feel about the training. This is important, because it allows you to claim that the training was a success.
  13. Hand out certificates of completion to all participants.
  14. Profit.

It is a simple formula, and it is apparently easy to sell such courses to companies. I have attended several of these courses, and they all follow this same basic formula. They are all a waste of time, and they are all a scam.

Saturday, May 24, 2025

More Bullshit

Early on in my career at Google I attended an offsite seminar for grooming managers. We had the usual set of morning lectures, and then we were split into smaller groups for discussion. The topic was “What motivates people?” Various answers were suggested, but I came up with these four: Sex, drugs, power, and money.

For some reason, they were not happy with my answer.

They seemed to think that I was not taking the question seriously. But history has shown that these four things are extremely strong motivations. They can motivate people to do all sorts of things they wouldn't otherwise consider. Like, for example, to betray their country.

If you look for them, you can find the motivations I listed. They are thinly disguised, of course. Why are administrative assistants so often attractive young women? Is Google's massage program completely innocent? Are there never any “special favors”? The beer flows pretty freely on Fridays, and the company parties were legendary. Of course there are the stock options and the spot bonuses.

I guess I was being too frank for their taste. They were looking for some kind of “corporate culture” answer, like “team spirit” or “collaboration”. I was just being honest.

I soon discovered that the true topic of this three day offsite seminar was bullshit. It was a “course” dreamed up by business “psychologists” and peddled to Silicon Valley companies as a scam. If you've ever encountered “team building” courses, you'll know what I mean. It is a lucrative business. This was just an extra large helping.

They didn't like my frankness, and I didn't see how they could expect to really understand what motivates people if they were not going to address the obvious. They thought I wasn't being serious with my answer, but it was clear that they weren't going to be seriously examining the question either.

So I left the seminar. I didn't want to waste my time listening to corporate psychobabble. My time was better spent writing code and engineering software. I returned to my office to actually accomplish some meaningful work. No doubt I had ruined any chance of becoming a big-wig, but I simply could not stomach the phoniness and insincerity. I was not going to play that game.

Tuesday, May 20, 2025

Management = Bullshit

The more I have to deal with management, the more I have to deal with bullshit. The higher up in the management chain, the denser the bullshit. Now I'm not going to tell you that all management is useless, but there is a lot more problem generation than problem solving.

Lately I've been exploring the potentials of LLMs as a tool in my day-to-day work. They have a number of technical limitations, but some things they excel at. One of those things is generating the kinds of bullshit that management loves to wallow in. Case in point: our disaster recovery plan.

Someone in management got it into their head that we should have a formal disaster recovery plan. Certainly this is a good idea, but there are tradeoffs to be made. After all, we have yearly fire drills, but we don't practice "duck and cover" or evacuation in case of flooding. We have a plan for what to do in case of a fire, but we don't have a plan for what to do in case of a zombie apocalypse. But management wants a plan for everything, no matter how unlikely.

Enter the LLM. It can generate plans like nobody's business. It can generate a plan for what to do in case of a fire, a meteor strike, or a zombie apocalypse. The plans are useless, naturally. They are just bullshit. But they satisfy management's jonesing for plans, and best of all, they require no work on my part. It saved me hours of work yesterday.

Tuesday, May 13, 2025

Purchasing White Elephants

As a software engineer, I'm constantly trying to persuade management to avoid doing stupid things. Management is of the opinion that because they are paying the engineers anyway, the software is essentially free. In my experience, bespoke software is one of the most expensive things you can waste money on. You're usually better off setting your money on fire than writing custom software.

But managers get ideas in their heads and it falls upon us engineers to puncture them. I wish I were less ethical. I'd just take the money and spend it as long as it kept flowing. But I wouldn't be able to live with myself. I have to at least try to persuade them to avoid the most egregious boondoggles. If they still insist on doing the project, well, so be it.

I'm absolutely delighted to find that these LLMs are very good at making plausible sounding proposals for software projects. I was asked about a project recently and I just fed the parameters into the LLM and asked it for an outline of the project, estimated headcount, time, and cost. It suggested we could do it in 6 months with 15 engineers at a cost of $3M. (I think it was more than a bit optimistic, frankly, but it was a good start.) It provided a phased breakdown of the project and the burn rate. Management was curious about how long it would take 1 engineer and the LLM suggested 3-6 years.

Management was suitably horrified.

I've been trying to persuade them that the status quo has been satisfying our needs, costs nothing, needs no engineers, and is ready today, but they didn't want to hear it. But now they are starting to see the light.

Thursday, May 1, 2025

It Still Sucks

Don’t get me wrong. I’m not saying that the alternatives are any better or even any different.

Unix has been around more than forty years and it is still susceptible to I/O deadlock when you try to run a subprocess and stream input to it and output from it. The processes run just fine for a while, then they hang indefinitely waiting for input and output from some buffer to synchronize.

I’m trying to store data in a database. There aren't any good database bindings I could find, so I wrote a small program that reads a record from stdin and writes it to the database. I launch this program from Common Lisp and write records to the input of the program. It works for about twenty records and then hangs. I've tried to be careful to flush and drain all streams from both ends, to no avail.

I have a workaround: start the program, write one record, and quit the program. This doesn’t hang and reliably writes a record to the database, but it isn’t fast and it is constantly initializing and creating a database connection and tearing it back down for each record.

You'd think that subprocesses communicating via a stream of characters would be simple.
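For what it is worth, one common cause of this kind of hang is the parent writing to the child's stdin without anyone reading the child's stdout. Once the output pipe's buffer fills, the child blocks on its write, stops reading its input, and the parent then blocks on its own write. I have not confirmed that this is what is happening in my case, so treat the following as a sketch under that assumption. It is in Go rather than Common Lisp, it uses `cat` as a stand-in for the database writer, and streamRecords is a hypothetical name. The parent drains the child's output in a separate goroutine while it writes.

```go
package main

import (
	"bufio"
	"fmt"
	"os/exec"
)

// streamRecords (hypothetical) writes n records to a child's stdin while a
// goroutine drains the child's stdout. It returns how many lines came back.
func streamRecords(n int) (int, error) {
	cmd := exec.Command("cat") // stand-in for the database writer (assumption)
	stdin, err := cmd.StdinPipe()
	if err != nil {
		return 0, err
	}
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return 0, err
	}
	if err := cmd.Start(); err != nil {
		return 0, err
	}

	// Drain stdout concurrently so the child never blocks on a full pipe.
	counted := make(chan int, 1)
	go func() {
		lines := 0
		scanner := bufio.NewScanner(stdout)
		for scanner.Scan() {
			lines++
		}
		counted <- lines
	}()

	w := bufio.NewWriter(stdin)
	for i := 0; i < n; i++ {
		fmt.Fprintf(w, "record %d: some payload\n", i)
	}
	if err := w.Flush(); err != nil {
		return 0, err
	}
	// Closing stdin signals end of input; without it the child waits forever.
	if err := stdin.Close(); err != nil {
		return 0, err
	}

	lines := <-counted // finish reading before calling Wait
	return lines, cmd.Wait()
}

func main() {
	lines, err := streamRecords(100000)
	fmt.Println(lines, err)
}
```

The record count is deliberately large enough to overflow a typical pipe buffer. If the reader goroutine is removed and the output is read only after all the writes, the same program can stall partway through.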