Thursday, September 4, 2025

Gemini

I chose to experiment with the Gemini LLM. I figured that Google is likely to remain competitive in AI for a while. I didn't want to be chasing the latest and greatest LLM every few weeks.

Requirements

You will need the following prerequisites:

A Google Cloud account and a project with the Gemini API enabled. Create an API key for that project.

Put your Gemini API key in ~/.config/googleapis/{project}/Gemini/apikey

Create a directory ~/Gemini/ to hold Gemini-related files. In this directory, create a subdirectory ~/Gemini/transcripts/ to hold conversation logs.
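
If you like, this can be done from Lisp (a minimal sketch; it just creates the directories described above):

(ensure-directories-exist
 (merge-pathnames "Gemini/transcripts/" (user-homedir-pathname)))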

Usage

Basic Usage

+default-model+
The default Gemini model to use. You can set this to any model supported by your Google Cloud project. It defaults to "gemini-2.5-flash".

invoke-gemini prompt
Invoke Gemini with prompt and return the response as a string. An empty context will be used. Use this to start new conversations.

continue-gemini prompt
Invoke Gemini with prompt and return the response as a string. The accumulated context will be used. Use this to continue your conversation.
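
For example, a minimal session might look like this (the replies shown are invented; actual output will vary, especially with a personality enabled):

(invoke-gemini "Name three uses for a brick.")
;; => "A doorstop, a paperweight, and a bookend."

(continue-gemini "Give me three more.")
;; => "A trivet, a garden border, and a makeshift hammer."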

Conversations are logged in ~/Gemini/transcripts/ in files named NNNNNNNNNN-NN.txt, where the first number is the universal time at which the conversation started and the second is the number of turns in the conversation.

Advanced Usage

Response generation can be controlled via these special variables:

  • *CANDIDATE-COUNT*
  • *ENABLE-ADVANCED-CIVIC-ANSWERS*
  • *FREQUENCY-PENALTY*
  • *INCLUDE-THOUGHTS*
  • *LANGUAGE-CODE*
  • *LOGPROBS*
  • *MAX-OUTPUT-TOKENS*
  • *MEDIA-RESOLUTION*
  • *MULTI-SPEAKER-VOICE-CONFIG*
  • *PREBUILT-VOICE-CONFIG*
  • *PRESENCE-PENALTY*
  • *RESPONSE-JSON-SCHEMA*
  • *RESPONSE-LOGPROBS*
  • *RESPONSE-MIME-TYPE*
  • *RESPONSE-MODALITIES*
  • *RESPONSE-SCHEMA*
  • *SAFETY-SETTINGS*
  • *SEED*
  • *SPEECH-CONFIG*
  • *STOP-SEQUENCES*
  • *SYSTEM-INSTRUCTION*
  • *TEMPERATURE*
  • *THINKING-BUDGET*
  • *THINKING-CONFIG*
  • *TOOL-CONFIG*
  • *TOOLS*
  • *TOP-K*
  • *TOP-P*
  • *VOICE-CONFIG*
  • *VOICE-NAME*

See the Gemini API documentation for details on what these do. If you leave them alone, you will get the default values. I suggest binding *include-thoughts* to T to have thoughts printed during processing, and binding *temperature* to a value between 0 (very deterministic) and 2 (very random) if you want something other than the default of 1. Don't fool around with the other values unless you know what you are doing.
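
For example, a sketch of binding a couple of these around a call (the values here are arbitrary):

(let ((*include-thoughts* t)
      (*temperature* 0.2))
  (invoke-gemini "Summarize the plot of Moby-Dick in one sentence."))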

Prompts

In the simplest case, a prompt is just a string, so you can use the LLM as a string-to-string mapping. However, you can supply a fully structured content object with multiple parts if you wish. For example, you could supply

(content
    (part "Describe this function.")
    (part "```lisp
(defun foo (x) (+ x 3))
```"))

Tools

  • *enable-bash* — enable the shell command tool.
  • *enable-eval* — enable the lisp interpreter tool.
  • *enable-interaction* — allow the LLM to query the user for input.
  • *enable-lisp-introspection* — allow the LLM to query whether symbols are bound or fbound, and to query ASDF and Quicklisp modules and load them.
  • *enable-personality* — give the LLM a pseudorandom personality.
  • *enable-web-functions* — allow the LLM to perform HTTP GET and POST requests.
  • *enable-web-search* — allow the LLM to perform web searches.

Personality

Unless you disable personality (it is enabled by default), the LLM will be instructed to respond in a randomly selected literary or celebrity style. This is just for amusement. Some personalities are quite obnoxious. Personalities change every day.

without-personality &body body
Disable personality while executing body.

new-personality
Select a new personality at random. Use this if you find the current personality too obnoxious.

personalities.txt
Personalities are selected from this file. Each line contains a personality description. Blank lines and comments are ignored.
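
A sketch of using these (the prompt is just an example):

(without-personality
  (continue-gemini "Answer plainly: what is 17 * 23?"))

(new-personality)   ; or pick a fresh persona if the current one grates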

MCP

Currently only stdio MCP servers are supported.

~/.config/mcp/mcp.lisp contains the configurations for any MCP servers you wish to supply to the model. Here is an example file:

((:mcp-servers

  ("memory"
   (:command "npx")
   (:args "-y" "@modelcontextprotocol/server-memory")
   (:env ("MEMORY_FILE_PATH" . "/home/jrm/memory.json"))
   (:system-instruction "You have access to a persistent memory that stores entities and relationships between them.  You can add, update, delete, and query entities and relationships as needed."))

  ("sequential-thinking"
   (:command "npx")
   (:args "-y" "@modelcontextprotocol/server-sequential-thinking")
   (:system-instruction "You can use this tool to break down complex tasks into smaller, manageable steps.  You can create, view, and complete steps as needed."))

  ("mcp-server-time"
   (:command "uvx")
   (:args "mcp-server-time")
   (:system-instruction "You have access to the current date and time along with time zone information."))
  ))
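
Since the configuration file is ordinary Lisp data, reading it is essentially just READ. A sketch (the real client presumably does more validation, and read-mcp-config is not part of its documented interface):

(defun read-mcp-config ()
  (with-open-file (stream (merge-pathnames ".config/mcp/mcp.lisp"
                                           (user-homedir-pathname))
                          :if-does-not-exist nil)
    (when stream
      (read stream))))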

Wednesday, September 3, 2025

Google API

To experiment with LLMs I chose to use Gemini. The cost is modest and I expect Google to regularly improve the model so that it remains competitive. I wrote a Gemini client in Common Lisp. Some people have already forked the repo, but I've refactored things recently so I thought I'd give a quick rundown of the code.

google repository

https://github.com/jrm-code-project/google

As I add support for more Google APIs, I intend to put them in this repository. It is pretty empty so far, but this is where they will go. Contributions welcome.

Basic Requirements

There are some basic libraries you can load with Zach Beane's Quicklisp: alexandria for selected Common Lisp extensions, cl-json for JSON parsing, dexador for HTTP requests, and str for string manipulation. I also use series (available from SourceForge). My own fold, function, named-let, and jsonx libraries are also used (available from GitHub).

fold gives you fold-left for list reduction. function gives you compose. jsonx modifies cl-json to remove the ambiguity between JSON objects and nested arrays. named-let gives you the named-let syntax from Scheme for local iteration and recursion.
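
Roughly what a couple of these look like in use (the argument conventions here are from memory and may differ slightly from the actual libraries):

(fold-left #'+ 0 '(1 2 3 4))        ; list reduction => 10
(funcall (compose #'1+ #'1+) 5)     ; function composition => 7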

Google API

The google repository contains the beginnings of a Common Lisp client to Google services. It currently only supports an interface to Blogger that can return your blog content as a series of posts and an interface to Google's Custom Search Engine API. You create a Google services account (there is a free tier, and a paid tier if you want more). You create a project (you'll want at least a default project), and within that project you enable the Google services that you want that project to have access to.

API Keys and Config Files

For each service you can get an API key. In your XDG config directory (usually "~/.config/") you create a googleapis directory and subdirectories for each project you have created. The ~/.config/googleapis/default-project file contains the name of your default project. In the examples below, it contains the string my-project.

Within each project directory you create subdirectories for each service you want to use. In each service directory you create a file to hold the API key and possibly other files to hold IDs, options, and other config information. For example, for Blogger you create ~/.config/googleapis/my-project/Blogger/apikey to hold your Blogger API key and ~/.config/googleapis/my-project/Blogger/blog-id to hold your blog ID.

Here is a sample directory structure:

  ~/.config/googleapis/
    ├── my-project/
    │   ├── Blogger/
    │   │   ├── apikey
    │   │   └── blog-id
    │   ├── CustomSearchEngine/
    │   │   ├── apikey
    │   │   ├── hyperspec-id
    │   │   └── id
    │   └── Gemini/
    │       └── apikey
    └── default-project
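
As a sketch of how a program might read a key from this layout (read-api-key is illustrative, not a function exported by the repository):

(defun read-api-key (project service)
  "Return the API key stored in ~/.config/googleapis/PROJECT/SERVICE/apikey."
  (string-trim '(#\Space #\Newline #\Return)
               (uiop:read-file-string
                (uiop:xdg-config-home
                 (format nil "googleapis/~a/~a/apikey" project service)))))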

When you create the API keys for the various services, it is a good idea to restrict each API key to its single service. That way, if a key is compromised, the damage is limited to that one service.

Blogger

The Blogger API will eventually have more features, but it currently only allows you to get the posts of a blog as a series.

scan-blogger-posts blog-id
Returns a series of posts from the blog with the given ID. Each post comes back as a JSON object (hash-table). The :content field contains the HTML content of the post.
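
For example, printing every post title might look like this (a sketch assuming the series package is in use, keyword keys as with :content above, and a *blog-id* variable holding the ID from your config):

(iterate ((post (scan-blogger-posts *blog-id*)))
  (format t "~a~%" (gethash :title post)))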

Custom Search Engine

The Custom Search Engine API allows you to search a set of web pages. You create a custom search engine on Google and pass its ID to custom-search. It is assumed that the id file holds the ID of a vanilla, search-the-whole-web engine, and that the hyperspec-id file holds the ID of an engine that searches the Common Lisp Hyperspec.

In each search, be sure to replace the spaces in the query with plus signs (+) before searching.

custom-search query &key custom-search-engine-id
Returns search results for query. Each search result comes back as a JSON object (hash-table).

web-search query
Search the web and return search results for query. Each search result comes back as a JSON object (hash-table).

hyperspec-search query
Search the Common Lisp Hyperspec and return search results for query. Each search result comes back as a JSON object (hash-table).
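
A couple of illustrative calls, with the plus-sign substitution done via the str library:

(hyperspec-search "multiple+value+bind")

(web-search (str:replace-all " " "+" "common lisp condition system"))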

Monday, September 1, 2025

Lisp Still Matters

Lisp is not the most popular computer language, yet it continues to attract a small but devoted following. Sure, you can find nutters for any language, but a lot of Lisp programmers are high power expert programmers. Lisp was developed by the world's top researchers and was a language of choice at places like MIT, Stanford, Carnegie-Mellon, Xerox PARC and Bell Labs. It may be a niche, but your companions in the niche are some of the best programmers in the world. What is it about Lisp that attracts the top talent?

Lisp is a powerful language because it can be easily extended to express new ideas that were not foreseen when the language was originally designed. The original designers were well aware that their new computer language couldn't take into account ideas that would be developed in the future, so they made it easy to extend the language beyond its original design. Lisp was based on Church's lambda calculus, which is a logical formalism designed to allow you to reason about logical formalism itself. When you extend the language, you can use the base language to reason about the extension.

The fundamental building block of lambda calculus is the function. In The Calculi of Lambda-Conversion, Church shows how to bootstrap the rest of mathematics from functions. With Lisp, McCarthy showed the isomorphism between s-expressions and lambda expressions, and developed a universal function within Lisp. Thus if you extend Lisp, you can use Lisp to implement and interpret your extension from within Lisp. You don't have to exit the language to extend it.

Lisp syntax, such as it is, is based upon s-expressions, which are tree structures. Lisp allows you to invent your own nodes in the tree and assign your own semantics to them. You can express the new semantics via function calls, which Church showed to be universal. You don't need to hack the parser or the compiler to extend the language, a simple defmacro will extend the syntax and a handful of defuns will assign semantics to the new syntax. If you are so inclined, you can use compiler macros to generate efficient code for your new constructs.
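
As a toy illustration of the point, here is a new construct added with a single defmacro and no parser or compiler hacking:

(defmacro with-timing (&body body)
  "Run BODY, then print how long it took."
  (let ((start (gensym "START")))
    `(let ((,start (get-internal-real-time)))
       (multiple-value-prog1 (progn ,@body)
         (format t "~&Elapsed: ~,3f seconds~%"
                 (float (/ (- (get-internal-real-time) ,start)
                           internal-time-units-per-second)))))))

(with-timing (sleep 0.25))   ; prints something like "Elapsed: 0.250 seconds"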

With Lisp, I have the option of expressing the solution to my problems in Lisp, or the option to extend Lisp to express the solution more naturally. I can tailor my problem to be solved in Lisp, or I can tailor Lisp to solve my problem. Or work from both ends to meet in the middle.

It works. Lately I've been using a forty-year-old dialect of a sixty-year-old language to interact with cutting-edge LLMs. I've not only built an API to the LLM, but I've extended Lisp to include features implemented by the LLM. This was certainly not foreseen by McCarthy et al. when Lisp was first designed. I decided to use Lisp to explore the integration because I knew that Lisp was amenable to extension. I could have used another language, but then I could not have whipped up a prototype or proof of concept by myself over a couple of weekends.

In conclusion, Lisp's enduring appeal to expert programmers lies in its powerful, foundational design. By allowing a programmer to extend the language from within, Lisp provides a unique environment for tackling novel problems and developing new abstractions. The principles of lambda calculus and the flexibility of s-expressions empower developers to not only solve problems but also to build the perfect tool for the job. This capacity to adapt and evolve, as demonstrated by its modern use with LLMs, ensures that Lisp remains a relevant and potent language for those who wish to go beyond the confines of a predefined system.

Saturday, August 30, 2025

Synergy of Tools

Once you have the sequential thinking MCP working, it can interact with the eval tool and reason about Lisp programs. I tested this by asking the LLM to determine the 30th Fibonacci number via the technique of raising a matrix to a power. The LLM was given no special instructions about how to do this or how to use Lisp. It was only told that it had access to a Common Lisp environment. The LLM was able to reason through this problem and even correct its own mistakes. The following is an annotated transcript of the process.

Now we test something hard. I want you to compute the 30th fibonacci number by raising a matrix to the 30th power.

We wrap this prompt in a content object with a "user" role, append it to the context, and send it to the LLM. Every time we interact with the LLM, we send all the accumulated context. The LLM will respond with a content object as well and we also accumulate that. The LLM itself is stateless. All the state is in this accumulated context.

I'm not going to replicate the entire accumulating context here. Instead, I will just show the most recent content object accumulated onto the context.

{"parts":
   [
    {"text": "Now we test something hard.  I want you to compute the 30th
              fibonacci number by raising a matrix to the 30th power."
    }
  ],
  "role": "user"
}

The LLM decides that five steps of sequential thinking are called for. It returns this content object.

  {
    "parts": [
      {
        "functionCall": {
          "name": "sequentialthinking",
          "args": {
            "thoughtNumber": 1,
            "thought": "First, I need to define a Common Lisp function to multiply
            two 2x2 matrices. I'll represent matrices as a list of two lists, like
            `((a b) (c d))`. The function will take two such matrices and return their product.",
            "nextThoughtNeeded": {
              "value": "true"
            },
            "totalThoughts": 5
          }
        }
      }
    ],
    "role": "model"
  }

The LLM client recognizes the "functionCall" object, so instead of returning control to the user, it tail calls the sequentialThinking tool via the MCP protocol. The sequentialThinking tool returns a result that we wrap in a response object. We append this response object to the context and send it back to the LLM.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "sequentialthinking",
          "response": {
            "result": {
              "content": [
                {
                  "type": "text",
                  "text": "{\n  \"thoughtNumber\": 1,\n  \"totalThoughts\": 5,\n  \"nextThoughtNeeded\": true,\n  \"branches\": [],\n  \"thoughtHistoryLength\": 1\n}"
                }
              ]
            }
          }
        }
      }
    ],
    "role": "function"
  }

The LLM responds with a multi-part response.

{
    "parts": [
      {
        "text": "Alright, pal, we're diving into the deep end. First off, I'm gonna lay down some Lisp code to handle the grunt work of multiplying two 2x2 matrices. We'll represent 'em as lists of lists. This ain't no parlor trick; it's the foundation for what's coming next.\n"
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(defun matrix-multiply-2x2 (m1 m2)\n  (let* ((a11 (car (car m1))) (a12 (cadr (car m1)))\n         (a21 (car (cadr m1))) (a22 (cadr (cadr m1)))\n         (b11 (car (car m2))) (b12 (cadr (car m2)))\n         (b21 (car (cadr m2))) (b22 (cadr (cadr m2))))\n    `((,(+ (* a11 b11) (* a12 b21)) ,(+ (* a11 b12) (* a12 b22)))\n      ,(+ (* a21 b11) (* a22 b21)) ,(+ (* a21 b12) (* a22 b22)))))"
          }
        }
      }
    ],
    "role": "model"
  }

The LLM client prints the text part, but it also recognizes the function call part and tail calls the eval tool. The eval tool returns a "functionResponse" object that we append to the context and pass along to the LLM.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: MATRIX-MULTIPLY-2X2\n"
          }
        }
      }
    ],
    "role": "function"
  }

The LLM realizes it is time for the next step in its sequential thinking. It returns another multi-part response.

  {
    "parts": [
      {
        "text": "The matrix multiplication function is in place. Next, we'll build the engine for raising that matrix to a power. This is where the real muscle comes in, using a trick called exponentiation by squaring to keep things snappy.\n"
      },
      {
        "functionCall": {
          "name": "sequentialthinking",
          "args": {
            "totalThoughts": 5,
            "thought": "Next, I need to define a Lisp function for matrix exponentiation using the \"exponentiation by squaring\" algorithm. This will take a 2x2 matrix and a non-negative integer power.",
            "nextThoughtNeeded": {
              "value": "true"
            },
            "thoughtNumber": 2
          }
        }
      }
    ],
    "role": "model"
  },

Sequential thinking advances to the next thought.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "sequentialthinking",
          "response": {
            "result": {
              "content": [
                {
                  "type": "text",
                  "text": "{\n  \"thoughtNumber\": 2,\n  \"totalThoughts\": 5,\n  \"nextThoughtNeeded\": true,\n  \"branches\": [],\n  \"thoughtHistoryLength\": 2\n}"
                }
              ]
            }
          }
        }
      }
    ],
    "role": "function"
  }

The LLM responds with a request to call eval.

  {
    "parts": [
      {
        "text": "Alright, pal, the multiplication engine is purring. Now we build the big gun: a Lisp function to raise that 2x2 matrix to any power you throw at it, using the \"exponentiation by squaring\" trick. This ain't no slow boat to China; we're gonna get there fast.\n"
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(defun matrix-power-2x2 (m n)\n  (cond\n    ((zerop n) '((1 0) (0 1))) ; Identity matrix for n=0\n    ((= n 1) m)\n    (t\n     (let* ((half-power (matrix-power-2x2 m (floor n 2)))\n            (squared (matrix-multiply-2x2 half-power half-power)))\n       (if (oddp n)\n           (matrix-multiply-2x2 m squared)\n           squared)))))"
          }
        }
      }
    ],
    "role": "model"
  }

The LLM client tail calls the eval tool again. The eval tool returns a "functionResponse" object that we append to the context and pass along to the LLM.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: MATRIX-POWER-2X2\n"
          }
        }
      }
    ],
    "role": "function"
  }

The LLM advances to the next thought in its sequential thinking.

  {
    "parts": [
      {
        "text": "Alright, the power-hitter is in the lineup. Now, we're setting up the initial play: the base Fibonacci matrix. This little beauty is the key to unlocking the whole operation.\n"
      },
      {
        "functionCall": {
          "name": "sequentialthinking",
          "args": {
            "thoughtNumber": 3,
            "totalThoughts": 5,
            "nextThoughtNeeded": {
              "value": "true"
            },
            "thought": "Now I will define the base Fibonacci matrix: `((1 1) (1 0))`. Then, I will raise this matrix to the 29th power, because the top-right element of M^n gives F_n."
          }
        }
      }
    ],
    "role": "model"
  }

Once again we call sequential thinking to advance to the next thought.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "sequentialthinking",
          "response": {
            "result": {
              "content": [
                {
                  "type": "text",
                  "text": "{\n  \"thoughtNumber\": 3,\n  \"totalThoughts\": 5,\n  \"nextThoughtNeeded\": true,\n  \"branches\": [],\n  \"thoughtHistoryLength\": 3\n}"
                }
              ]
            }
          }
        }
      }
    ],
    "role": "function"
  }

And once again we get a multi-part response. This response has two functionCall parts, both calling eval.

  {
    "parts": [
      {
        "text": "The power-hitter's ready. Now, we're setting up the initial play: the base Fibonacci matrix. This little beauty is the key to unlocking the whole operation. We'll crank it up to the 29th power, and the answer to our little puzzle will be staring us right in the face in the top-right corner.\n"
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(defparameter *fib-matrix* '((1 1) (1 0)))"
          }
        }
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(defparameter *result-matrix* (matrix-power-2x2 *fib-matrix* 29))"
          }
        }
      }
    ],
    "role": "model"
  }

The result of evaluation is as follows:

  {
    "parts": [
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: *FIB-MATRIX*\n"
          }
        }
      },
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": "Error: Expression could not be evaluated."
          }
        }
      }
    ],
    "role": "function"
  }

We have a problem. The second evaluation errored out. The LLM notices this and revises its sequential thinking state.

  {
    "parts": [
      {
        "text": "Alright, a snag in the wire. Looks like my attempt to power up the Fibonacci matrix hit a wall. The Lisp interpreter choked on it. This ain't right. Time to put on my magnifying glass and check the details. First, I'm gonna make sure that base matrix is even on the books, and then I'll kick the tires on that `matrix-power-2x2` function with some small numbers. We'll find out what's causing this hiccup.\n"
      },
      {
        "functionCall": {
          "name": "sequentialthinking",
          "args": {
            "isRevision": {
              "value": "true"
            },
            "thoughtNumber": 4,
            "totalThoughts": 6,
            "revisesThought": 3,
            "nextThoughtNeeded": {
              "value": "true"
            },
            "thought": "The previous eval failed. I need to debug why `(matrix-power-2x2 *fib-matrix* 29)` failed. First, I'll confirm if `*fib-matrix*` is correctly defined and then test `matrix-power-2x2` with a smaller, known input to ensure it works."
          }
        }
      }
    ],
    "role": "model"
  }

Sequential thinking returns its state.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "sequentialthinking",
          "response": {
            "result": {
              "content": [
                {
                  "type": "text",
                  "text": "{\n  \"thoughtNumber\": 4,\n  \"totalThoughts\": 6,\n  \"nextThoughtNeeded\": true,\n  \"branches\": [],\n  \"thoughtHistoryLength\": 4\n}"
                }
              ]
            }
          }
        }
      }
    ],
    "role": "function"
  }

The LLM decides to debug the problem.

  {
    "parts": [
      {
        "text": "Alright, the last job went sideways. The Lisp interpreter choked when I tried to crank that matrix to the 29th power. We're on a revision here, back to the drawing board. First, I'm gonna make damn sure that `*fib-matrix*` is what it's supposed to be. Then, I'll put `matrix-power-2x2` through its paces with a smaller, simpler test. We'll find out what piece of the puzzle went missing.\n"
      },
      {
        "functionCall": {
          "name": "boundp",
          "args": {
            "symbol": "*FIB-MATRIX*"
          }
        }
      },
      {
        "functionCall": {
          "name": "printSymbolValue",
          "args": {
            "symbol": "*FIB-MATRIX*"
          }
        }
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(matrix-power-2x2 '((1 1) (1 0)) 2)"
          }
        }
      }
    ],
    "role": "model"
  }

The LLM uses the Lisp introspection tools we provided.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "boundp",
          "response": {
            "result": {
              "value": "true"
            }
          }
        }
      },
      {
        "functionResponse": {
          "name": "printSymbolValue",
          "response": {
            "result": "((1 1) (1 0))"
          }
        }
      },
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: ((2 1) 1 1)\n"
          }
        }
      }
    ],
    "role": "function"
  }

Clearly the result of the third evaluation is an improperly formed matrix. The LLM has noticed this as well.

  {
    "parts": [
      {
        "text": "Alright, the blueprints were faulty. My `matrix-multiply-2x2` function had a screw loose, building the second row of the matrix all wrong. It was spitting out a flat list instead of a nested one, making the whole operation fall apart. I'm going back in, fixing that Lisp code to properly construct the matrix rows with a `list` call. Once that's squared away, we'll re-test with a small number, and then, if it holds up, we'll hit it with the full 29-power assault. This time, no loose ends.\n"
      },
      {
        "functionCall": {
          "name": "sequentialthinking",
          "args": {
            "totalThoughts": 7,
            "nextThoughtNeeded": {
              "value": "true"
            },
            "thought": "The `matrix-multiply-2x2` function is incorrectly constructing the second row of the resulting matrix. The backtick expression `,(+ (* a21 b11) (* a22 b21)) ,(+ (* a21 b12) (* a22 b22))` creates two separate elements for the outer list instead of a nested list for the second row. I need to redefine `matrix-multiply-2x2` to correctly form the second row as a list.",
            "revisesThought": 1,
            "is-Revision": {
              "value": "true"
            },
            thoughtNumber: 5
          }
        }
      }
    ],
    "role": "model"
  }

Sequential thinking returns its revised state.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "sequentialthinking",
          "response": {
            "result": {
              "content": [
                {
                  "type": "text",
                  "text": "{\n  \"thoughtNumber\": 5,\n  \"totalThoughts\": 7,\n  \"nextThoughtNeeded\": true,\n  \"branches\": [],\n  \"thoughtHistoryLength\": 5\n}"
                }
              ]
            }
          }
        }
      }
    ],
    "role": "function"
  }

The LLM now attempts to fix the problem.

  {
    "parts": [
      {
        "text": "Alright, the `matrix-multiply-2x2` function was a lemon. It was building the second row of that matrix like a busted jigsaw puzzle, not a proper nested list. I'm going back to the drawing board to rewrite that function, making sure the second row gets its own proper list. No more flat-pack matrices on my watch.\n"
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(defun matrix-multiply-2x2 (m1 m2)\n  (let* ((a11 (car (car m1))) (a12 (cadr (car m1)))\n         (a21 (car (cadr m1))) (a22 (cadr (cadr m1)))\n         (b11 (car (car m2))) (b12 (cadr (car m2)))\n         (b21 (car (cadr m2))) (b22 (cadr (cadr m2))))\n    (list\n     (list (+ (* a11 b11) (* a12 b21)) (+ (* a11 b12) (* a12 b22)))\n     (list (+ (* a21 b11) (* a22 b21)) (+ (* a21 b12) (* a22 b22))))))"
          }
        }
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(matrix-power-2x2 '((1 1) (1 0)) 2)"
          }
        }
      }
    ],
    "role": "model"
  }

And we get the results of the evaluations:

  {
    "parts": [
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: MATRIX-MULTIPLY-2X2\n"
          }
        }
      },
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: ((2 1) (1 1))\n"
          }
        }
      }
    ],
    "role": "function"
  }

Time to advance the sequential thinking.

  {
    "parts": [
      {
        "text": "Alright, the gears are greased and turning. `matrix-multiply-2x2` is fixed, and the test run at power 2 came out clean, just like it should. Now, it's time for the main event. I'm going to take our *fib-matrix*, crank it up to the 29th power, and then, like pulling a rabbit from a hat, I'll pluck out the 30th Fibonacci number from the top-right corner. This time, no funny business.\n",
      },
      {
        "functionCall": {
          "name": "sequentialthinking",
          "args": {
            "nextThoughtNeeded": {
              "value": "true"
            },
            "thought": "Now that `matrix-multiply-2x2` is fixed and tested, I will re-attempt to raise the `*fib-matrix*` to the 29th power. Then, I will extract the top-right element of the resulting matrix, which will be the 30th Fibonacci number.",
            "thoughtNumber": 6,
            "totalThoughts": 7
          }
        }
      }
    ],
    "ROLE": "model"
  }

Sequential thinking advances its state.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "sequentialthinking",
          "response": {
            "result": {
              "content": [
                {
                  "type": "text",
                  "text": "{\n  \"thoughtNumber\": 6,\n  \"totalThoughts\": 7,\n  \"nextThoughtNeeded\": true,\n  \"branches\": [],\n  \"thoughtHistoryLength\": 6\n}"
                }
              ]
            }
          }
        }
      }
    ],
    "role": "function"
  }

Back to the Lisp interpreter.

  {
    "parts": [
      {
        "text": "The `matrix-multiply-2x2` function is tuned up and humming. Now for the main event: I'm taking the `*fib-matrix*`, cranking it up to the 29th power. Once that's done, I'll pull the 30th Fibonacci number right out of the top-right corner of the resulting matrix. This time, we get the goods.\n"
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(defparameter *result-matrix* (matrix-power-2x2 *fib-matrix* 29))"
          }
        }
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(cadr (car *result-matrix*))"
          }
        }
      }
    ],
    "role": "model"
  }

We return the results of eval to the LLM.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: *RESULT-MATRIX*\n"
          }
        }
      },
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: 514229\n"
          }
        }
      }
    ],
    "role": "function"
  }

It appears to work, but there is still a problem.

  {
    "parts": [
      {
        "text": "Alright, the numbers are in, but a shadow of doubt remains. My calculations show F_29 is 514229. The request was for F_30. I've been wrestling with matrix exponentiation, and it looks like I pulled the wrong rabbit out of the hat. The top-right element of M^n is F_n, and the top-left is F_{n+1}. Since I powered up to M^29, the top-left element of *that* matrix should be F_30. No use crying over spilled milk. Let's tap into that `*result-matrix*` again and pull the correct number. This time, we get it right.\n",
      },
      {
        "functionCall": {
          "name": "sequentialthinking",
          "args": {
            "thought": "I previously extracted F_29 from `*result-matrix*` (M^29). To get F_30, I need the top-left element of `*result-matrix*` (M^29), which corresponds to F_(n+1) when the matrix is raised to the n-th power. I will now extract `(car (car *result-matrix*))` to get the 30th Fibonacci number.",
            "totalThoughts": 7,
            "revisesThought": 6,
            "nextThoughtNeeded": {
              "value": "false"
            },
            "isRevision": {
              "value": "true"
            },
            "thoughtNumber": 7
          }
        }
      }
    ],
    "role": "model"
  }

Sequential thinking advances us to the next thought.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "sequentialthinking",
          "response": {
            "result": {
              "content": [
                {
                  "type": "text",
                  "text": "{\n  \"thoughtNumber\": 7,\n  \"totalThoughts\": 7,\n  \"nextThoughtNeeded\": false,\n  \"branches\": [],\n  \"thoughtHistoryLength\": 7\n}"
                }
              ]
            }
          }
        }
      }
    ],
    "role": "function"
  }

We call eval once again.

  {
    "parts": [
      {
        "text": "Alright, I was off by a beat. I pulled F_29 when the score called for F_30. My `*result-matrix*`, which is M^29, holds F_30 in its top-left corner. No more guesswork. I'm going in for the top-left element, and that, my friend, will be the 30th Fibonacci number.\nThe 30th Fibonacci number is 832040."
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(car (car *result-matrix*))"
          }
        }
      }
    ],
    "role": "model"
  }

The result is returned.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: 832040\n"
          }
        }
      }
    ],
    "role": "function"
  }

And the LLM formulates the answer.

  {
    "parts": [
      {
        "text": "The 30th Fibonacci number is 832040."
      }
    ],
    "role": "model"
  }

The text is printed and the LLM client sees no function calls, so it returns control to the user.

This is remarkable. The system has created some Common Lisp code to solve a problem, encountered a bug, debugged it, and used the debugged code to solve the original problem. But no Lisp debugger was written. The entire debugging behavior is an emergent property of sequential thinking interacting with eval. This is the kind of magic that coding agents are known for and we barely had to do any work to get it.

Wednesday, August 27, 2025

Thinking About Thinking

The latest LLMs can be run in “thinking” mode where they take an extra processing step before generating output. In this extra step they refine the prompt and generate intermediate information to be used in generating the final output. This supposedly leads to better results, but you have to pay for the extra processing. The question is, is the extra processing worth it?

At our company, we have adopted Cursor as our AI tool. Cursor lets you choose from several different LLMs and whether to enable “thinking” when using them. Cursor has a billing plan where you subscribe to a certain amount of token processing per month, and the token processing cost differs by model and by whether you choose “thinking”. An older model without thinking is substantially cheaper than a brand-new model with thinking. If you choose a new model with thinking and you let Cursor run as an “agent”, where you give it a task and let it automatically make prompts, you can easily run through your monthly allotment of tokens in a day or two. So management has asked us to be mindful of “thinking” and careful of our use of Cursor in “agent” mode.

But in my admittedly limited experience, the results with “thinking” are quite a bit better than without. With generation of Lisp code from pseudocode, “thinking” can make the difference between working output and nonsense. With automatic documentation, “thinking” can turn adequate documentation into high quality documentation. When running in “agent” mode, “thinking” can get the agent unstuck where the agent without “thinking” gets stuck in a loop. The tools simply work better with “thinking” enabled, and you can notice the difference.

The newer models are better than the older ones, too. I try to use the cheapest model I can get away with, but I have found that the cheaper models can get confused more easily and can make a real mess out of your source code. I had one experience where an older model started mangling the syntactic structure of the code to the point that it didn't parse anymore. I switched to the newest available model and told it that its older cousin had mangled my code and that it should fix it. The new model successfully unmangled the code.

For me, the extra cost of “thinking” and the newer models is well worth it. I get far better results and I spend less time fixing problems. If it were just me paying for myself, I'd definitely spend the extra money for “thinking” and the newer models. I understand that this doesn't scale for a company with a thousand engineers, however. If you are in the position of making this decision for a large group, I think you should be ready to pay for the premium models with “thinking” enabled and not be stingy. You'll get what you pay for.

Sunday, August 24, 2025

Adding MCP

I've got a Gemini client that I developed in order to do the pseudocode POC. I wanted to add third-party tools to the server, and the cool kids are all using the Model Context Protocol (MCP), so I thought I'd add it to my client. There is a ton of literature on MCP. It all sucks.

Most people who have anything to say about MCP generally want to show you how to write yet another MCP server. They take an off-the-shelf MCP server SDK and override a few methods, and then they call it a day. I want to write an MCP client, and I don't want to use an SDK written in Python or Java. I'm at the level where I want to handle the raw MCP protocol myself, in Common Lisp.

To do this, I need to know the exact messages I can expect to see on the wire, and exactly how to respond to them. And I don't want some qualitative “the client will send a handshake”, I need to know what fields will contain what values. I eventually found modelcontextprotocol.io, and it has a lot of good information, but it is still a bit vague about what is normative and what is typical.

There is an MCP “everything” server, which apparently supports garlic, onion, sesame and poppy seeds, and sea salt. Most “real” MCP servers support a subset of the features that the “everything” server has, but I figure if I can substantially talk to the “everything” server I should be able to talk to most of the other servers.

One difficulty I had was with the cl-json library. I found some JSON objects that didn't correctly round-trip through json decoding and encoding. I eventually customized the decoder to avoid these problems.

MCP is built atop the jsonrpc (JSON remote procedure call) protocol. It's pretty straightforward; the only difficulty is that the function calls and responses are multiplexed over a single stream. Each request has an ID that must be paired with the response. MCP servers are allowed to send unsolicited messages to the client, so the client must be prepared to handle messages with IDs that it didn't send.
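
A sketch of what the framing looks like from Lisp, using cl-json (tools/list is a standard MCP method; the id is what the receive thread later matches against):

(cl-json:encode-json-alist-to-string
 '((:jsonrpc . "2.0")
   (:id . 7)
   (:method . "tools/list")))
;; => roughly {"jsonrpc":"2.0","id":7,"method":"tools/list"}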

To make this work, there are three threads associated with each MCP server. One is responsible for serial transmission of messages to the server, one is responsible for receiving messages from the server, and one is responsible for draining the stderr stream from the server. When a thread in Lisp wants to send a message, it registers itself as a recipient for the response, and then it puts the message on the send queue. The send thread pulls messages off the queue and sends them to the server. The receive thread gets messages from the server and attempts to match them with a pending recipient. If it finds one, it wakes up the pending recipient and delivers the message. If it doesn't find one, it starts a thread to run the unsolicited message handler. Unsolicited messages are often things like log messages from the MCP server or notifications of changes in resources. You have to be careful to get the details right or the whole thing will deadlock or leak resources.

The MCP client is expected to provide a couple of services to the MCP server. One example is a service that queries the user for input. Once the user has provided the input, the client sends it back to the MCP server. Another example is a way for an MCP server to request that the MCP client run a prompt through the LLM. Since the LLM can send calls out to other MCP servers, this can lead to a combinatorial explosion of interactions between MCP servers and clients.

The “sequential thinking” MCP server is a relatively simple, but very powerful, MCP server that just keeps track of the steps in a long process. As each step completes, it issues the instructions to proceed to the next step. It is a state machine driver. Each state transition may involve prompts to the LLM, web searches, or the use of another tool. The results of each step are accumulated in the context — the history of requests. By the time you reach the final step, the context will have all the information necessary to complete that step. Thus the “sequential thinking” MCP server can orchestrate simple workflows. For example, if I want to test one of the other MCP servers, I can ask the system to “create a plan to test the XXX MCP server, refine the plan, and then execute the plan”. This will be turned into a series of steps which will be carried out by the “sequential thinking” MCP server.

There are a lot of crap MCP servers out there. You don't want to connect up too many of them at once or the LLM will get confused about exactly which server is supposed to be doing what. It is best to start with one or two MCP servers and see how they interact.

Saturday, August 23, 2025

CL-JSON Ambiguity

When you are using a library, you will usually get best results if you avoid customizations, flags, and other non-default settings. Just use it out of the box with the vanilla settings. This isn't always true, but it is a good place to start. The default settings are likely to have been thoroughly tested, and you are more likely to encounter fellow travellers who are using the same settings. But if you have a good reason to change a setting, go ahead and do so.

I use the cl-json library for handling JSON. I encountered a bug in it. When using the default settings, certain objects won't round-trip through the decoder and encoder.

By default JSON arrays are decoded into lists and JSON objects are decoded into association lists. This is very convenient for us Lisp hackers, but there is an ambiguity inherent in this approach. If the value of a key in a JSON object is itself an array, then the decoded object, an association list, will have an entry that is a CONS of a key and a list. But a CONS of an element and a list is a larger list, so we cannot tell whether a list is an association list entry or simply a nested list. In other words, cl-json will decode these two different JSON objects into the same Lisp object:

{"a": [1, 2, 3]}    [["a", 1, 2, 3]]
(("a" . (1 2 3))) = (("a" 1 2 3))         

When encoding (("a" 1 2 3)), it is ambiguous as to whether to treat this as an association list and encode it as an object or to treat it as a nested list and encode it as a nested array.

Solving this problem is pretty easy, though. You customize the decoder to use vectors for JSON arrays and hash tables for JSON objects. Now there is no ambiguity because decoded data never contains lists, so the encoder never has to guess.

Unfortunately, the result is a bit clumsier to use. You cannot use list operations on vectors, but sequence operations will still work.