Monday, July 28, 2025

Pseudo

I was wondering what it would look like if a large language model were part of your programming language. I'm not talking about calling the model as an API, but rather embedding it as a language construct. I came up with this idea as a first cut.

The pseudo macro allows you to embed pseudocode expressions in your Common Lisp code. It takes a string description and uses an LLM to expand it into an s-expression. You can use pseudo anywhere an expression would be expected.

(defun my-func (a b)
  (pseudo "multiply b by factorial of a."))
MY-FUNC

(my-func 5 3)
360

(defun quadratic (a b c)
  (let ((d (sqrt (pseudo "compute discriminant of quadratic equation"))))
    (values (/ (+ (- b) d) (* 2 a)) (/ (- (- b) d) (* 2 a)))))
QUADRATIC

(quadratic 1 2 -3)
1.0
-3.0
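
For comparison, here is a hand-written version of what the two pseudo forms above have to compute. This is only a sketch: the LLM's actual expansion may take a different shape, and the names factorial, my-func-by-hand, and quadratic-by-hand are mine, chosen for illustration.

```lisp
;; Hand-written stand-ins for the expansions; the model's real output may differ.
(defun factorial (n)
  "Product of the integers from 1 to N (1 when N is 0)."
  (loop with product = 1
        for i from 1 to n
        do (setf product (* product i))
        finally (return product)))

(defun my-func-by-hand (a b)
  ;; "multiply b by factorial of a."
  (* b (factorial a)))

(defun quadratic-by-hand (a b c)
  ;; "compute discriminant of quadratic equation" is b^2 - 4ac
  (let ((d (sqrt (- (* b b) (* 4 a c)))))
    (values (/ (+ (- b) d) (* 2 a))
            (/ (- (- b) d) (* 2 a)))))
```

With these definitions, (my-func-by-hand 5 3) and (quadratic-by-hand 1 2 -3) return the same values as the transcript above.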

The pseudo macro gathers contextual information and packages it up in a big set of system instructions to the LLM. The instructions include

  • the lexically visible variables in the macro environment
  • fbound symbols
  • bound symbols
  • overall directives to influence code generation
  • directives to influence the style of the generated code (functional vs. imperative)
  • directives to influence the use of the loop macro (prefer vs. avoid)
  • the source code of the file currently being compiled, if there is one
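
To make the list above concrete, here is a rough sketch of how the fbound and bound symbols of a package could be collected and folded into a block of instruction text. It uses only standard Common Lisp. The function names and the wording of the text are assumptions for illustration and are not taken from the pseudo source; the lexical variables are passed in as an argument because extracting them is implementation-specific.

```lisp
;; A sketch only: helper names and prompt wording are hypothetical.
(defun symbols-satisfying (predicate &optional (package *package*))
  "Sorted names of the symbols whose home package is PACKAGE and that satisfy PREDICATE."
  (let ((home (find-package package))
        (names '()))
    (do-symbols (sym home)
      (when (and (eq (symbol-package sym) home)
                 (funcall predicate sym))
        ;; DO-SYMBOLS may visit a symbol more than once, so avoid duplicates.
        (pushnew (symbol-name sym) names :test #'string=)))
    (sort names #'string<)))

(defun context-instructions (lexical-variables &optional (package *package*))
  "Build a plain-text context section listing LEXICAL-VARIABLES and PACKAGE's definitions."
  (format nil "Lexically visible variables: ~{~A~^, ~}~%~
               Fbound symbols: ~{~A~^, ~}~%~
               Bound symbols: ~{~A~^, ~}~%"
          lexical-variables
          (symbols-satisfying #'fboundp package)
          (symbols-satisfying #'boundp package)))
```

Restricting the scan to symbols whose home package is the current one keeps the text short; a real implementation might well choose a different filter.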

pseudo sets the LLM to use a low temperature for more predictable generation. It prints the “thinking” of the LLM.

Lisp is a big win here. Since Lisp's macro system operates at the level of s-expressions, it has more contextual information available to it than a macro system that is just text expansion. The s-expression representation means that we don't need to interface with the language's parser or compiler to operate on the syntax tree of the code. Adding pseudo to a language like Java would be a much more significant undertaking.
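
Here is a small sketch of why the s-expression representation helps: text that comes back from a model can be turned into code with the standard reader and returned directly from a macro. A canned string stands in for the model call, so nothing here talks to an LLM. The names fake-llm-reply, reply->form, and pseudo-sketch are hypothetical and are not part of pseudo.

```lisp
;; The helpers are wrapped in EVAL-WHEN so they exist at macroexpansion time,
;; even when this file is compiled with COMPILE-FILE.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun fake-llm-reply (description)
    "Stand-in for a model call: ignores DESCRIPTION and returns canned source text."
    (declare (ignore description))
    "(* b (reduce #'* (loop for i from 1 to a collect i)))")

  (defun reply->form (text)
    "Read TEXT into an s-expression, with #. evaluation disabled."
    (let ((*read-eval* nil))
      (values (read-from-string text)))))

(defmacro pseudo-sketch (description)
  "Expand into whatever form the (fake) model returned for DESCRIPTION."
  (reply->form (fake-llm-reply description)))

(defun my-func-sketch (a b)
  (pseudo-sketch "multiply b by factorial of a."))
```

Here (my-func-sketch 5 3) returns 360, matching the earlier transcript. The form produced by reply->form is an ordinary list, so it can be examined with first, rest, and friends before it is handed to the compiler.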

pseudo has the usual LLM caveats:

  • The LLM is slow.
  • The LLM can be expensive.
  • The LLM can produce unpredictable and unwanted code.
  • The LLM can produce incorrect code; the more precise you are in your pseudocode, the more likely you are to get the results you want.
  • You would be absolutely mad to use this in production.

pseudo has one SBCL-specific dependency: a function that extracts the lexically visible variables from the macro environment. If you port it to another Common Lisp, you'll want to provide an equivalent function.
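
As a rough idea of what such a function might look like, the sketch below reads the variable names out of an SBCL compile-time environment and falls back to an empty list elsewhere. It assumes the internal, unexported accessor sb-c::lexenv-vars, which may change between SBCL releases and may not be what pseudo itself calls. The names lexical-variable-names and visible-variables are hypothetical.

```lisp
;; Assumption: SBCL keeps the variables of a compile-time environment in an
;; alist read by the internal accessor SB-C::LEXENV-VARS. This is not a
;; supported interface.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun lexical-variable-names (env)
    "Names of the variables visible in macro environment ENV, or NIL when unknown."
    (declare (ignorable env))
    ;; IGNORE-ERRORS covers a NIL or interpreter environment, which the
    ;; accessor rejects.
    #+sbcl (ignore-errors (mapcar #'car (sb-c::lexenv-vars env)))
    #-sbcl nil))

(defmacro visible-variables (&environment env)
  "Expand into a quoted list of the variable names visible at the call site."
  `',(lexical-variable-names env))
```

On SBCL, a call such as (let ((x 1) (y 2)) (visible-variables)) should produce a list that includes X and Y. On other implementations this sketch simply returns NIL, which marks the spot where a port needs its own equivalent.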

pseudo was developed using Google's Gemini as the back end, but there's no reason it couldn't be adapted to use other LLMs. To try it out, you'll need the gemini library, available at https://github.com/jrm-code-project/gemini, and a Google API key.

Download pseudo from https://github.com/jrm-code-project/pseudo.

You'll also need these dependencies.

If you try it, let me know how it goes.

2 comments:

Josh Ballanco said...

Great minds think alike?

I just presented at JuliaCon and included a similar concept in my presentation (code at https://github.com/jballanc/ElegantREPL.jl). Still very early stages of the idea, and I was attempting to make everything work locally with Ollama, but I plan to expand on it. Eventually, I'd like to be able to write a macro like `plan"A type to hold date information and a series of functions to manipulate those dates, output them, and calculate ranges"`; have that expand into a series of `generate"A struct with members year, month, day, hour, minute...` etc., and then expand those (after review) into code.

I agree, though, that Lisps (which include Julia ;-) ) should have a distinct advantage thanks to reflection, introspection, simplified parsing, and metaprogramming.

Joe Marshall said...

The LLM seems to perform better when the scope is limited. At the
small expression level, it is constrained enough to be able to produce
reliable and predictable results, but if you ask it to produce an
entire definition or a library of definitions, you immediately run
into the problem of wondering what the LLM has decided to name the
function or functions. For this reason, I instruct the LLM to not
expand into anything beginning with "def".

I thought about expanding the system to handle class definitions, but
the main problem here is that it would rely on the user to provide the
names of the slots, and after you've provided that, you're essentially
done anyway.