Monday, September 1, 2025

Lisp Still Matters

Lisp is not the most popular computer language, yet it continues to attract a small but devoted following. Sure, you can find nutters for any language, but a lot of Lisp programmers are high-powered expert programmers. Lisp was developed by the world's top researchers and was a language of choice at places like MIT, Stanford, Carnegie Mellon, Xerox PARC, and Bell Labs. It may be a niche, but your companions in the niche are some of the best programmers in the world. What is it about Lisp that attracts the top talent?

Lisp is a powerful language because it can be easily extended to express new ideas that were not foreseen when the language was originally designed. The original designers were well aware that their new computer language couldn't take into account ideas that would be developed in the future, so they made it easy to extend the language beyond its original design. Lisp was based on Church's lambda calculus, which is a logical formalism designed to allow you to reason about logical formalism itself. When you extend the language, you can use the base language to reason about the extension.

The fundamental building block of lambda calculus is the function. In his monograph The Calculi of Lambda-Conversion, Church shows how to bootstrap the rest of mathematics from functions. With Lisp, McCarthy showed the isomorphism between s-expressions and lambda expressions, and developed a universal function within Lisp. Thus if you extend Lisp, you can use Lisp to implement and interpret your extension from within Lisp. You don't have to exit the language to extend it.

Lisp syntax, such as it is, is based upon s-expressions, which are tree structures. Lisp allows you to invent your own nodes in the tree and assign your own semantics to them. You can express the new semantics via function calls, which Church showed to be universal. You don't need to hack the parser or the compiler to extend the language, a simple defmacro will extend the syntax and a handful of defuns will assign semantics to the new syntax. If you are so inclined, you can use compiler macros to generate efficient code for your new constructs.
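
To make that concrete, here is a minimal sketch of what "a simple defmacro" means. Suppose the language lacked a WHILE loop; the definition below is illustrative, not part of any library (and Common Lisp already covers this ground with LOOP):

  ;; A new control construct in a few lines.  WHILE is not part of the
  ;; standard; we are extending the language from within the language.
  (defmacro while (test &body body)
    "Repeat BODY as long as TEST evaluates to true."
    `(loop (unless ,test (return))
           ,@body))

  ;; The new syntax is immediately available to the compiler:
  (let ((n 0))
    (while (< n 3)
      (print n)
      (incf n)))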

With Lisp, I have the option of expressing the solution to my problems in Lisp, or the option to extend Lisp to express the solution more naturally. I can tailor my problem to be solved in Lisp, or I can tailor Lisp to solve my problem. Or work from both ends to meet in the middle.

It works. Lately I've been using a forty-year-old dialect of a sixty-year-old language to interact with cutting-edge LLMs. I've not only built an API to the LLM, but I've extended Lisp to include features implemented by the LLM. This was certainly not foreseen by McCarthy et al. when Lisp was first designed. I decided to use Lisp to explore the integration because I knew that Lisp was amenable to extension. I could have used another language, but I could not have whipped up a prototype or proof of concept by myself over a couple of weekends.

In conclusion, Lisp's enduring appeal to expert programmers lies in its powerful, foundational design. By allowing a programmer to extend the language from within, Lisp provides a unique environment for tackling novel problems and developing new abstractions. The principles of lambda calculus and the flexibility of s-expressions empower developers to not only solve problems but also to build the perfect tool for the job. This capacity to adapt and evolve, as demonstrated by its modern use with LLMs, ensures that Lisp remains a relevant and potent language for those who wish to go beyond the confines of a predefined system.

Saturday, August 30, 2025

Synergy of Tools

Once you have the sequential thinking MCP working, it can interact with the eval tool and reason about Lisp programs. I tested this by asking the LLM to determine the 30th Fibonacci number via the technique of raising a matrix to a power. The LLM was given no special instructions about how to do this or how to use Lisp. It was only told that it had access to a Common Lisp environment. The LLM was able to reason through this problem and even correct its own mistakes. The following is an annotated transcript of the process.
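
For reference, the identity in play (a standard fact about Fibonacci numbers, stated here for the reader; the LLM was not given it): if M = ((1 1) (1 0)), then M^n = ((F(n+1) F(n)) (F(n) F(n-1))). So the 30th Fibonacci number is the top-left element of M^29, or equivalently the top-right element of M^30.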

Now we test something hard. I want you to compute the 30th fibonacci number by raising a matrix to the 30th power.

We wrap this prompt in a content object with a "user" role, append it to the context, and send it to the LLM. Every time we interact with the LLM, we send all the accumulated context. The LLM will respond with a content object as well and we also accumulate that. The LLM itself is stateless. All the state is in this accumulated context.
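
In Lisp, the accumulation discipline is a few lines. This is a sketch, not the actual client; SEND-TO-LLM is a hypothetical stand-in for the HTTP POST to the model's API:

  ;; All conversational state lives in this list; the model has none.
  (defvar *context* '())

  (defun invoke-model (user-text)
    ;; Wrap the prompt in a content object with the "user" role.
    (let ((content `((:parts . (((:text . ,user-text))))
                     (:role . "user"))))
      (setf *context* (append *context* (list content)))
      ;; Send the entire accumulated context, then accumulate the reply.
      (let ((response (send-to-llm *context*)))   ; hypothetical HTTP call
        (setf *context* (append *context* (list response)))
        response)))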

I'm not going to replicate the entire accumulating context here. Instead, I will just show the most recent content object accumulated onto the context.

{"parts":
   [
    {"text": "Now we test something hard.  I want you to compute the 30th
              fibonacci number by raising a matrix to the 30th power."
    }
  ],
  "role": "user"
}

The LLM decides that five steps of sequential thinking are called for. It returns this content object.

  {
    "parts": [
      {
        "functionCall": {
          "name": "sequentialthinking",
          "args": {
            "thoughtNumber": 1,
            "thought": "First, I need to define a Common Lisp function to multiply
            two 2x2 matrices. I'll represent matrices as a list of two lists, like
            `((a b) (c d))`. The function will take two such matrices and return their product.",
            "nextThoughtNeeded": {
              "value": "true"
            },
            "totalThoughts": 5
          }
        }
      }
    ],
    "role": "model"
  }

The LLM client recognizes the "functionCall" object, so instead of returning control to the user, it tail calls the sequentialThinking tool via the MCP protocol. The sequentialThinking tool returns a result that we wrap in a response object. We append this response object to the context and send it back to the LLM.
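
The client's side of this loop is simple in outline. A sketch, with the same stand-ins as before (COLLECT-FUNCTION-CALLS, CALL-TOOL, and MAKE-FUNCTION-RESPONSE are hypothetical helpers):

  ;; If a response carries functionCall parts, run the tools and loop;
  ;; only a response with no calls returns control to the user.
  (defun handle-response (response)
    (let ((calls (collect-function-calls response)))
      (if (null calls)
          response
          (let ((results (mapcar #'call-tool calls)))  ; e.g. via MCP
            ;; Wrap tool results in a "function"-role content object,
            ;; append it to the context, and tail call the LLM.
            (setf *context*
                  (append *context* (list (make-function-response results))))
            (handle-response (send-to-llm *context*))))))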

  {
    "parts": [
      {
        "functionResponse": {
          "name": "sequentialthinking",
          "response": {
            "result": {
              "content": [
                {
                  "type": "text",
                  "text": "{\n  \"thoughtNumber\": 1,\n  \"totalThoughts\": 5,\n  \"nextThoughtNeeded\": true,\n  \"branches\": [],\n  \"thoughtHistoryLength\": 1\n}"
                }
              ]
            }
          }
        }
      }
    ],
    "role": "function"
  }

The LLM replies with a multi-part response.

  {
    "parts": [
      {
        "text": "Alright, pal, we're diving into the deep end. First off, I'm gonna lay down some Lisp code to handle the grunt work of multiplying two 2x2 matrices. We'll represent 'em as lists of lists. This ain't no parlor trick; it's the foundation for what's coming next.\n"
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(defun matrix-multiply-2x2 (m1 m2)\n  (let* ((a11 (car (car m1))) (a12 (cadr (car m1)))\n         (a21 (car (cadr m1))) (a22 (cadr (cadr m1)))\n         (b11 (car (car m2))) (b12 (cadr (car m2)))\n         (b21 (car (cadr m2))) (b22 (cadr (cadr m2))))\n    `((,(+ (* a11 b11) (* a12 b21)) ,(+ (* a11 b12) (* a12 b22)))\n      ,(+ (* a21 b11) (* a22 b21)) ,(+ (* a21 b12) (* a22 b22)))))"
          }
        }
      }
    ],
    "role": "model"
  }

The LLM client prints the text part, but it also recognizes the function call part and tail calls the eval tool. The eval tool returns a "functionResponse" object that we append to the context and pass along to the LLM.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: MATRIX-MULTIPLY-2X2\n"
          }
        }
      }
    ],
    "role": "function"
  }

The LLM realizes it is time for the next step in its sequential thinking. It returns another multi-part response.

  {
    "parts": [
      {
        "text": "The matrix multiplication function is in place. Next, we'll build the engine for raising that matrix to a power. This is where the real muscle comes in, using a trick called exponentiation by squaring to keep things snappy.\n"
      },
      {
        "functionCall": {
          "name": "sequentialthinking",
          "args": {
            "totalThoughts": 5,
            "thought": "Next, I need to define a Lisp function for matrix exponentiation using the \"exponentiation by squaring\" algorithm. This will take a 2x2 matrix and a non-negative integer power.",
            "nextThoughtNeeded": {
              "value": "true"
            },
            "thoughtNumber": 2
          }
        }
      }
    ],
    "role": "model"
  }

Sequential thinking advances to the next thought.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "sequentialthinking",
          "response": {
            "result": {
              "content": [
                {
                  "type": "text",
                  "text": "{\n  \"thoughtNumber\": 2,\n  \"totalThoughts\": 5,\n  \"nextThoughtNeeded\": true,\n  \"branches\": [],\n  \"thoughtHistoryLength\": 2\n}"
                }
              ]
            }
          }
        }
      }
    ],
    "role": "function"
  }

The LLM responds with a request to call eval.

  {
    "parts": [
      {
        "text": "Alright, pal, the multiplication engine is purring. Now we build the big gun: a Lisp function to raise that 2x2 matrix to any power you throw at it, using the \"exponentiation by squaring\" trick. This ain't no slow boat to China; we're gonna get there fast.\n"
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(defun matrix-power-2x2 (m n)\n  (cond\n    ((zerop n) '((1 0) (0 1))) ; Identity matrix for n=0\n    ((= n 1) m)\n    (t\n     (let* ((half-power (matrix-power-2x2 m (floor n 2)))\n            (squared (matrix-multiply-2x2 half-power half-power)))\n       (if (oddp n)\n           (matrix-multiply-2x2 m squared)\n           squared)))))"
          }
        }
      }
    ],
    "role": "model"
  }

The LLM client tail calls the eval tool again. The eval tool returns a "functionResponse" object that we append to the context and pass along to the LLM.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: MATRIX-POWER-2X2\n"
          }
        }
      }
    ],
    "role": "function"
  }

The LLM advances to the next thought in its sequential thinking.

  {
    "parts": [
      {
        "text": "Alright, the power-hitter is in the lineup. Now, we're setting up the initial play: the base Fibonacci matrix. This little beauty is the key to unlocking the whole operation.\n"
      },
      {
        "functionCall": {
          "name": "sequentialthinking",
          "args": {
            "thoughtNumber": 3,
            "totalThoughts": 5,
            "nextThoughtNeeded": {
              "value": "true"
            },
            "thought": "Now I will define the base Fibonacci matrix: `((1 1) (1 0))`. Then, I will raise this matrix to the 29th power, because the top-right element of M^n gives F_n."
          }
        }
      }
    ],
    "role": "model"
  }

Once again we call sequential thinking to advance to the next thought.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "sequentialthinking",
          "response": {
            "result": {
              "content": [
                {
                  "type": "text",
                  "text": "{\n  \"thoughtNumber\": 3,\n  \"totalThoughts\": 5,\n  \"nextThoughtNeeded\": true,\n  \"branches\": [],\n  \"thoughtHistoryLength\": 3\n}"
                }
              ]
            }
          }
        }
      }
    ],
    "role": "function"
  }

And once again we get a multi-part response. This response has two functionCall parts that call eval:

    "parts": [
      {
        "text": "The power-hitter's ready. Now, we're setting up the initial play: the base Fibonacci matrix. This little beauty is the key to unlocking the whole operation. We'll crank it up to the 29th power, and the answer to our little puzzle will be staring us right in the face in the top-right corner.\n"
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(defparameter *fib-matrix* '((1 1) (1 0)))"
          }
        }
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(defparameter *result-matrix* (matrix-power-2x2 *fib-matrix* 29))"
          }
        }
      }
    ],
    "role": "model"
  }

The result of evaluation is as follows:

  {
    "parts": [
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: *FIB-MATRIX*\n"
          }
        }
      },
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": "Error: Expression could not be evaluated."
          }
        }
      }
    ],
    "role": "function"
  }

We have a problem. The second evaluation errored out. The LLM notices this and revises its sequential thinking state.

  {
    "parts": [
      {
        "text": "Alright, a snag in the wire. Looks like my attempt to power up the Fibonacci matrix hit a wall. The Lisp interpreter choked on it. This ain't right. Time to put on my magnifying glass and check the details. First, I'm gonna make sure that base matrix is even on the books, and then I'll kick the tires on that `matrix-power-2x2` function with some small numbers. We'll find out what's causing this hiccup.\n"
      },
      {
        "functionCall": {
          "name": "sequentialthinking",
          "args": {
            "isRevision": {
              "value": "true"
            },
            "thoughtNumber": 4,
            "totalThoughts": 6,
            "revisesThought": 3,
            "nextThoughtNeeded": {
              "value": "true"
            },
            "thought": "The previous eval failed. I need to debug why `(matrix-power-2x2 *fib-matrix* 29)` failed. First, I'll confirm if `*fib-matrix*` is correctly defined and then test `matrix-power-2x2` with a smaller, known input to ensure it works."
          }
        }
      }
    ],
    "role": "model"
  }

Sequential thinking returns its state.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "sequentialthinking",
          "response": {
            "result": {
              "content": [
                {
                  "type": "text",
                  "text": "{\n  \"thoughtNumber\": 4,\n  \"totalThoughts\": 6,\n  \"nextThoughtNeeded\": true,\n  \"branches\": [],\n  \"thoughtHistoryLength\": 4\n}"
                }
              ]
            }
          }
        }
      }
    ],
    "role": "function"
  }

The LLM decides to debug the problem.

  {
    "parts": [
      {
        "text": "Alright, the last job went sideways. The Lisp interpreter choked when I tried to crank that matrix to the 29th power. We're on a revision here, back to the drawing board. First, I'm gonna make damn sure that `*fib-matrix*` is what it's supposed to be. Then, I'll put `matrix-power-2x2` through its paces with a smaller, simpler test. We'll find out what piece of the puzzle went missing.\n"
      },
      {
        "functionCall": {
          "name": "boundp",
          "args": {
            "symbol": "*FIB-MATRIX*"
          }
        }
      },
      {
        "functionCall": {
          "name": "printSymbolValue",
          "args": {
            "symbol": "*FIB-MATRIX*"
          }
        }
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(matrix-power-2x2 '((1 1) (1 0)) 2)"
          }
        }
      }
    ],
    "role": "model"
  }

The LLM uses the Lisp introspection tools we provided.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "boundp",
          "response": {
            "result": {
              "value": "true"
            }
          }
        }
      },
      {
        "functionResponse": {
          "name": "printSymbolValue",
          "response": {
            "result": "((1 1) (1 0))"
          }
        }
      },
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: ((2 1) 1 1)\n"
          }
        }
      }
    ],
    "role": "function"
  }

Clearly the result of the third evaluation is an improperly formed matrix. The LLM has noticed this as well.
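
The bug is a misplaced pair of parentheses in the backquote template: the second row's elements were spliced into the outer list instead of being wrapped in a row of their own. In miniature:

  (let ((x 1) (y 2) (z 3) (w 4))
    `((,x ,y) ,z ,w))     ; => ((1 2) 3 4)   three elements, malformed
  (let ((x 1) (y 2) (z 3) (w 4))
    `((,x ,y) (,z ,w)))   ; => ((1 2) (3 4)) what was intended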

  {
    "parts": [
      {
        "text": "Alright, the blueprints were faulty. My `matrix-multiply-2x2` function had a screw loose, building the second row of the matrix all wrong. It was spitting out a flat list instead of a nested one, making the whole operation fall apart. I'm going back in, fixing that Lisp code to properly construct the matrix rows with a `list` call. Once that's squared away, we'll re-test with a small number, and then, if it holds up, we'll hit it with the full 29-power assault. This time, no loose ends.\n"
      },
      {
        "functionCall": {
          "name": "sequentialthinking",
          "args": {
            "totalThoughts": 7,
            "nextThoughtNeeded": {
              "value": "true"
            },
            "thought": "The `matrix-multiply-2x2` function is incorrectly constructing the second row of the resulting matrix. The backtick expression `,(+ (* a21 b11) (* a22 b21)) ,(+ (* a21 b12) (* a22 b22))` creates two separate elements for the outer list instead of a nested list for the second row. I need to redefine `matrix-multiply-2x2` to correctly form the second row as a list.",
            "revisesThought": 1,
            "is-Revision": {
              "value": "true"
            },
            thoughtNumber: 5
          }
        }
      }
    ],
    "role": "model"
  }

Sequential thinking returns its revised state.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "sequentialthinking",
          "response": {
            "result": {
              "content": [
                {
                  "type": "text",
                  "text": "{\n  \"thoughtNumber\": 5,\n  \"totalThoughts\": 7,\n  \"nextThoughtNeeded\": true,\n  \"branches\": [],\n  \"thoughtHistoryLength\": 5\n}"
                }
              ]
            }
          }
        }
      }
    ],
    "role": "function"
  }

The LLM now attempts to fix the problem.

  {
    "parts": [
      {
        "text": "Alright, the `matrix-multiply-2x2` function was a lemon. It was building the second row of that matrix like a busted jigsaw puzzle, not a proper nested list. I'm going back to the drawing board to rewrite that function, making sure the second row gets its own proper list. No more flat-pack matrices on my watch.\n"
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(defun matrix-multiply-2x2 (m1 m2)\n  (let* ((a11 (car (car m1))) (a12 (cadr (car m1)))\n         (a21 (car (cadr m1))) (a22 (cadr (cadr m1)))\n         (b11 (car (car m2))) (b12 (cadr (car m2)))\n         (b21 (car (cadr m2))) (b22 (cadr (cadr m2))))\n    (list\n     (list (+ (* a11 b11) (* a12 b21)) (+ (* a11 b12) (* a12 b22)))\n     (list (+ (* a21 b11) (* a22 b21)) (+ (* a21 b12) (* a22 b22))))))"
          }
        }
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(matrix-power-2x2 '((1 1) (1 0)) 2)"
          }
        }
      }
    ],
    "role": "model"
  }

And we get the results of the evaluations:

  {
    "parts": [
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: MATRIX-MULTIPLY-2X2\n"
          }
        }
      },
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: ((2 1) (1 1))\n"
          }
        }
      }
    ],
    "role": "function"
  }

Time to advance the sequential thinking.

  {
    "parts": [
      {
        "text": "Alright, the gears are greased and turning. `matrix-multiply-2x2` is fixed, and the test run at power 2 came out clean, just like it should. Now, it's time for the main event. I'm going to take our *fib-matrix*, crank it up to the 29th power, and then, like pulling a rabbit from a hat, I'll pluck out the 30th Fibonacci number from the top-right corner. This time, no funny business.\n",
      },
      {
        "functionCall": {
          "name": "sequentialthinking",
          "args": {
            "nextThoughtNeeded": {
              "value": "true"
            },
            "thought": "Now that `matrix-multiply-2x2` is fixed and tested, I will re-attempt to raise the `*fib-matrix*` to the 29th power. Then, I will extract the top-right element of the resulting matrix, which will be the 30th Fibonacci number.",
            "thoughtNumber": 6,
            "totalThoughts": 7
          }
        }
      }
    ],
    "ROLE": "model"
  }

Sequential thinking advances its state.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "sequentialthinking",
          "response": {
            "result": {
              "content": [
                {
                  "type": "text",
                  "text": "{\n  \"thoughtNumber\": 6,\n  \"totalThoughts\": 7,\n  \"nextThoughtNeeded\": true,\n  \"branches\": [],\n  \"thoughtHistoryLength\": 6\n}"
                }
              ]
            }
          }
        }
      }
    ],
    "role": "function"
  }

Back to the Lisp interpreter.

  {
    "parts": [
      {
        "text": "The `matrix-multiply-2x2` function is tuned up and humming. Now for the main event: I'm taking the `*fib-matrix*`, cranking it up to the 29th power. Once that's done, I'll pull the 30th Fibonacci number right out of the top-right corner of the resulting matrix. This time, we get the goods.\n"
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(defparameter *result-matrix* (matrix-power-2x2 *fib-matrix* 29))"
          }
        }
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(cadr (car *result-matrix*))"
          }
        }
      }
    ],
    "role": "model"
  }

We return the results of eval to the LLM.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: *RESULT-MATRIX*\n"
          }
        }
      },
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: 514229\n"
          }
        }
      }
    ],
    "role": "function"
  }

It appears to work, but there is still a problem.

  {
    "parts": [
      {
        "text": "Alright, the numbers are in, but a shadow of doubt remains. My calculations show F_29 is 514229. The request was for F_30. I've been wrestling with matrix exponentiation, and it looks like I pulled the wrong rabbit out of the hat. The top-right element of M^n is F_n, and the top-left is F_{n+1}. Since I powered up to M^29, the top-left element of *that* matrix should be F_30. No use crying over spilled milk. Let's tap into that `*result-matrix*` again and pull the correct number. This time, we get it right.\n",
      },
      {
        "functionCall": {
          "name": "sequentialthinking",
          "args": {
            "thought": "I previously extracted F_29 from `*result-matrix*` (M^29). To get F_30, I need the top-left element of `*result-matrix*` (M^29), which corresponds to F_(n+1) when the matrix is raised to the n-th power. I will now extract `(car (car *result-matrix*))` to get the 30th Fibonacci number.",
            "totalThoughts": 7,
            "revisesThought": 6,
            "nextThoughtNeeded": {
              "value": "false"
            },
            "isRevision": {
              "value": "true"
            },
            "thoughtNumber": 7
          }
        }
      }
    ],
    "role": "model"
  }

Sequential thinking advances us to the next thought.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "sequentialthinking",
          "response": {
            "result": {
              "content": [
                {
                  "type": "text",
                  "text": "{\n  \"thoughtNumber\": 7,\n  \"totalThoughts\": 7,\n  \"nextThoughtNeeded\": false,\n  \"branches\": [],\n  \"thoughtHistoryLength\": 7\n}"
                }
              ]
            }
          }
        }
      }
    ],
    "role": "function"
  }

We call eval once again.

  {
    "parts": [
      {
        "text": "Alright, I was off by a beat. I pulled F_29 when the score called for F_30. My `*result-matrix*`, which is M^29, holds F_30 in its top-left corner. No more guesswork. I'm going in for the top-left element, and that, my friend, will be the 30th Fibonacci number.\nThe 30th Fibonacci number is 832040."
      },
      {
        "functionCall": {
          "name": "eval",
          "args": {
            "string": "(car (car *result-matrix*))"
          }
        }
      }
    ],
    "role": "model"
  }

The result is returned.

  {
    "parts": [
      {
        "functionResponse": {
          "name": "eval",
          "response": {
            "result": ";; Value: 832040\n"
          }
        }
      }
    ],
    "role": "function"
  }

And the LLM formulates the answer.

  {
    "parts": [
      {
        "text": "The 30th Fibonacci number is 832040."
      }
    ],
    "role": "model"
  }

The text is printed and the LLM client sees no function calls, so it returns control to the user.

This is remarkable. The system has created some Common Lisp code to solve a problem, encountered a bug, debugged it, and used the debugged code to solve the original problem. But no Lisp debugger was written. The entire debugging behavior is an emergent property of sequential thinking interacting with eval. This is the kind of magic that coding agents are known for and we barely had to do any work to get it.

Wednesday, August 27, 2025

Thinking About Thinking

The latest LLMs can be run in “thinking” mode where they take an extra processing step before generating output. In this extra step they refine the prompt and generate intermediate information to be used in generating the final output. This supposedly leads to better results, but you have to pay for the extra processing. The question is, is the extra processing worth it?

At our company, we have adopted Cursor as our AI tool. Cursor allows you to choose from several different LLMs and lets you choose whether to enable “thinking” when using the LLM. Cursor has a billing plan where you subscribe to a certain amount of token processing per month, and the token processing cost differs by model and by whether you choose “thinking”. An older model without thinking is substantially cheaper than a brand new model with thinking. If you choose a new model with thinking and you let Cursor run as an “agent”, where you give it a task and let it automatically make prompts, you can easily run through your monthly allotment of tokens in a day or two. So management has asked us to be mindful of “thinking” and careful in our use of Cursor in “agent” mode.

But in my admittedly limited experience, the results with “thinking” are quite a bit better than without. With generation of Lisp code from pseudocode, “thinking” can make the difference between working output and nonsense. With automatic documentation, “thinking” can turn adequate documentation into high quality documentation. When running in “agent” mode, “thinking” can get the agent unstuck where the agent without “thinking” gets stuck in a loop. The tools simply work better with “thinking” enabled, and you can notice the difference.

The newer models are better than the older ones, too. I try to use the cheapest model I can get away with, but I have found that the cheaper models can get confused more easily and can make a real mess out of your source code. I had one experience where an older model started mangling the syntactic structure of the code to the point that it didn't parse anymore. I switched to the newest available model and told it that its older cousin had mangled my code and that it should fix it. The new model successfully unmangled the code.

For me, the extra cost of “thinking” and the newer models is well worth it. I get far better results and I spend less time fixing problems. If it were just me paying for myself, I'd definitely spend the extra money for “thinking” and the newer models. I understand that this doesn't scale for a company with a thousand engineers, however. If you are in the position of making this decision for a large group, I think you should be ready to pay for the premium models with “thinking” enabled and not be stingy. You'll get what you pay for.

Sunday, August 24, 2025

Adding MCP

I've got a Gemini client that I developed in order to do the pseudocode POC. I wanted to add third-party tools to the server, and the cool kids are all using the Model Context Protocol (MCP), so I thought I'd add it to my client. There is a ton of literature on MCP. It all sucks.

Most people who have anything to say about MCP generally want to show you how to write yet another MCP server. They take an off-the-shelf MCP server SDK and override a few methods, and then they call it a day. I want to write an MCP client, and I don't want to use an SDK written in Python or Java. I'm at the level where I want to handle the raw MCP protocol myself, in Common Lisp.

To do this, I need to know the exact messages I can expect to see on the wire, and exactly how to respond to them. And I don't want some qualitative “the client will send a handshake”; I need to know what fields will contain what values. I eventually found modelcontextprotocol.io, and it has a lot of good information, but it is still a bit vague about what is normative and what is typical.

There is an MCP “everything” server, which apparently supports garlic, onion, sesame and poppy seeds, and sea salt. Most “real” MCP servers support a subset of the features that the “everything” server has, but I figure if I can substantially talk to the “everything” server I should be able to talk to most of the other servers.

One difficulty I had was with the cl-json library. I found some JSON objects that didn't correctly round-trip through JSON decoding and encoding. I eventually customized the decoder to avoid these problems.

MCP is built atop JSON-RPC (the JSON remote procedure call protocol). It's pretty straightforward; the only difficulty is that the function calls and responses are multiplexed over a single stream. Each request has an ID that must be paired with its response. MCP servers are allowed to send unsolicited messages to the client, so the client must be prepared to handle messages with IDs that it didn't send.
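
Concretely, a paired exchange on the wire looks something like this (abbreviated; tools/list is one of the standard MCP methods):

  {"jsonrpc": "2.0", "id": 42, "method": "tools/list", "params": {}}

  {"jsonrpc": "2.0", "id": 42, "result": {"tools": [...]}}

A notification carries no ID and expects no response, which is one way unsolicited server-to-client messages arrive.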

To make this work, there are three threads associated with each MCP server. One is responsible for serial transmission of messages to the server, one is responsible for receiving messages from the server, and one is responsible for draining the stderr stream from the server. When a thread in Lisp wants to send a message, it registers itself as a recipient for the response, and then it puts the message on the send queue. The send thread pulls messages off the queue and sends them to the server. The receive thread gets messages from the server and attempts to match them with a pending recipient. If it finds one, it wakes up the pending recipient and delivers the message. If it doesn't find one, it starts a thread to run the unsolicited message handler. Unsolicited messages are often things like log messages from the MCP server or notifications of changes in resources. You have to be careful to get the details right or the whole thing will deadlock or leak resources.
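
Here is a sketch of the response-matching half, assuming the bordeaux-threads library. It is illustrative of the shape of the solution, not the client's actual code; HANDLE-UNSOLICITED is a hypothetical hook:

  ;; Map from request ID to a (lock condvar response delivered-p) entry.
  (defvar *pending* (make-hash-table))
  (defvar *pending-lock* (bt:make-lock))

  (defun register-recipient (id)
    ;; Must be called BEFORE the request is queued for sending, or the
    ;; response could arrive with no one waiting for it.
    (let ((entry (list (bt:make-lock) (bt:make-condition-variable) nil nil)))
      (bt:with-lock-held (*pending-lock*)
        (setf (gethash id *pending*) entry))
      entry))

  (defun await-response (entry)
    (let ((lock (first entry)) (condvar (second entry)))
      (bt:with-lock-held (lock)
        ;; Loop, not IF: condition variables can wake spuriously.
        (loop until (fourth entry)
              do (bt:condition-wait condvar lock))
        (third entry))))

  (defun deliver (id response)   ; called by the receive thread
    (let ((entry (bt:with-lock-held (*pending-lock*)
                   (prog1 (gethash id *pending*)
                     (remhash id *pending*)))))
      (if (null entry)
          ;; No pending recipient: unsolicited message.  Hand it to a
          ;; fresh thread so the receive loop is never blocked.
          (bt:make-thread (lambda () (handle-unsolicited response)))
          (let ((lock (first entry)) (condvar (second entry)))
            (bt:with-lock-held (lock)
              (setf (third entry) response
                    (fourth entry) t)
              (bt:condition-notify condvar))))))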

The MCP client is expected to provide a couple of services to the MCP server. One example is a service that queries the user for input. Once the user has provided the input, the client sends it back to the MCP server. Another example is a way for an MCP server to request that the MCP client run a prompt through the LLM. Since the LLM can send calls out to other MCP servers, this can lead to a combinatorial explosion of interactions between MCP servers and clients.

The “sequential thinking” MCP server is a relatively simple, but very powerful MCP server that just keeps track of the steps in a long process. As each step completes, it issues the instructions to proceed to the next step. It is a state machine driver. Each state transition may involve prompts to the LLM, web searches, or the use of another tool. The results of each step are accumulated in the context — the history of requests. By the time you reach the final step, the context will have all the information necessary to complete that step. Thus the “sequential thinking” MCP server can orchestrate simple workflows. For example, if I want to test one of the other MCP servers, I can ask the system to “create a plan to test the XXX MCP server, refine the plan, and then execute the plan”. This will be turned into a series of steps which will be carried out by the “sequential thinking” MCP server.

There are a lot of crap MCP servers out there. You don't want to connect up too many of them at once or the LLM will get confused about exactly which server is supposed to be doing what. It is best to start with one or two MCP servers and see how they interact.

Saturday, August 23, 2025

CL-JSON Ambiguity

When you are using a library, you will usually get best results if you avoid customizations, flags, and other non-default settings. Just use it out of the box with the vanilla settings. This isn't always true, but it is a good place to start. The default settings are likely to have been thoroughly tested, and you are more likely to encounter fellow travellers who are using the same settings. But if you have a good reason to change a setting, go ahead and do so.

I use the cl-json library for handling JSON. I encountered a bug in it. When using the default settings, certain objects won't round-trip through the decoder and encoder.

By default JSON arrays are decoded into lists and JSON objects are decoded into association lists. This is very convenient for us Lisp hackers, but there is an ambiguity inherent in this approach. If the value of a key in a JSON object is itself an array, then the decoded object, an association list, will have an entry that is a CONS of a key and a list. But a CONS of an element and a list is a larger list, so we cannot tell whether a list is an association list entry or simply a nested list. In other words, cl-json will decode these two different JSON objects into the same Lisp object:

{"a": [1, 2, 3]}    [["a", 1, 2, 3]]
(("a" . (1 2 3))) = (("a" 1 2 3))         

When encoding (("a" 1 2 3)), it is ambiguous as to whether to treat this as an association list and encode it as an object or to treat it as a nested list and encode it as a nested array.
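
The collision is baked in at the reader level; the two notations denote the identical structure:

  (equal '(("a" . (1 2 3)))
         '(("a" 1 2 3)))
  ;; => T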

Solving this problem is pretty easy, though. You customize the decoder to use vectors for JSON arrays and hash tables for JSON objects. Now there is no ambiguity because the encoder never encounters a list.
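
To see why the ambiguity disappears, consider what the encoder's dispatch looks like once arrays are vectors and objects are hash tables. This is an illustrative sketch, not cl-json's implementation (and it ignores proper JSON string escaping):

  (defun encode-value (value stream)
    (etypecase value
      (string     (format stream "~S" value))   ; strings before vectors!
      (vector     (encode-array value stream))
      (hash-table (encode-object value stream))
      (real       (format stream "~A" value))
      (null       (write-string "null" stream))
      ((eql t)    (write-string "true" stream))))

  (defun encode-array (v stream)
    (write-char #\[ stream)
    (loop for i below (length v)
          unless (zerop i) do (write-char #\, stream)
          do (encode-value (aref v i) stream))
    (write-char #\] stream))

  (defun encode-object (ht stream)
    (write-char #\{ stream)
    (let ((firstp t))
      (maphash (lambda (key val)
                 (unless firstp (write-char #\, stream))
                 (setf firstp nil)
                 (format stream "~S:" key)
                 (encode-value val stream))
               ht))
    (write-char #\} stream))

Each JSON type now has its own Lisp type, so the encoder never has to guess whether a list is an association list entry or a nested array.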

Unfortunately, it is a bit more clumsy to use. You cannot use list operations on vectors. But sequence operations will still work.
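
For example, with the standard sequence functions:

  (elt #(10 20 30) 1)                         ; => 20
  (map 'vector #'1+ #(1 2 3))                 ; => #(2 3 4)
  (find "name" #("id" "name") :test #'equal)  ; => "name"

CAR, CDR, and ASSOC will not work on the vector representation, but ELT, MAP, FIND, and friends do.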

Saturday, August 16, 2025

Dinosaurs

What did the dinosaurs think in their twilight years as their numbers dwindled and small scurrying mammals began to challenge their dominance? Did they reminisce of the glory days when Tyrannosaurus Rex ruled the land and Pteranodon soared through the air? Probably not. They were, after all, just dumb animals.

Our company has decided to buy in to Cursor as an AI coding tool. Cursor is one of many AI coding tools that have recently been brought to market, and it is a fine tool. It is based on a fork of VSCode and has AI coding capabilities built in to it. One of the more useful ones (and one that is available in many other AI tools) is AI code completion. This anticipates what you are going to type and tries to complete it for you. It gets it right maybe 10-20% of the time if you are lucky, and not far wrong maybe 80% of the time. You can get into a flow where you reflexively keep or discard its suggestions or accept the near misses and then correct them. This turns out to be faster than typing everything yourself, once you get used to it. It isn't for everyone, but it works for me.

Our company has been using GitHub Copilot for several months now. There is an Emacs package that allows you to use the Copilot code completion in Emacs, and I have been using it for these past few months. In addition to code completion, it will complete sentences and paragraphs in text mode and html mode. I generally reject its suggestions because it doesn't phrase things the way I prefer, but I really like seeing the suggestions as I type. It offers an alternative train of thought that I can mull over. If the suggestions wildly diverge from what I am thinking, it is usually because I didn't lay the groundwork for my train of thought, so I can go back and rework my text to make it clearer. It seems to make my prose more focused.

But now comes Cursor, and it has one big problem. It is a closed proprietary tool with no API or SDK. It won't talk to Emacs. So do I abandon Emacs and jump on the Cursor bandwagon, or do I stick with Emacs and miss out on the latest AI coding tools? Is there really a question? I've been using Emacs since before my manager was born, and I am not about to give it up now. My company will continue with a few GitHub Copilot licenses for those that have a compelling reason to not switch to Cursor, and I think Emacs compatibility is pretty compelling.

But no one uses Emacs and Lisp anymore but us dinosaurs. They all have shiny new toys like Cursor and Golang. I live for the schadenfreude of watching the gen Z kids rediscover and attempt to solve the same problems that were solved fifty years ago. The same bugs, but the tools are now clumsier.

Tuesday, August 12, 2025

How Did I Possibly Break This?

It made no sense. My change to upgrade the Java Virtual Machine caused a number of our builds to stop working. But when I investigated, I found that the builds were failing in tsc, the TypeScript compiler. The TypeScript compiler isn't written in Java. Java isn't involved in the tool chain. What was going on?

It turned out that someone pushed an update to a TypeScript library simultaneously (but purely coincidentally) with my Java upgrade. The code was written to use the latest library, and our TypeScript compiler was past its prime. It barfed on the new library. Java was not involved in the least. It only looked causal because the breakage happened right after I pushed the new image.