Sunday, August 24, 2025

Adding MCP

I've got a Gemini client that I developed in order to do the pseudocode POC. I wanted to add third-party tools to the server, and the cool kids are all using the Model Context Protocol (MCP), so I thought I'd add it to my client. There is a ton of literature on MCP. It all sucks.

Most people who have anything to say about MCP generally want to show you how to write yet another MCP server. They take an off-the-shelf MCP server SDK and override a few methods, and then they call it a day. I want to write an MCP client, and I don't want to use an SDK written in Python or Java. I'm at the level where I want to handle the raw MCP protocol myself, in Common Lisp.

To do this, I need to know the exact messages I can expect to see on the wire, and exactly how to respond to them. And I don't want some qualitative “the client will send a handshake”, I need to know what fields will contain what values. I eventually found modelcontextprotocol.io, and it has a lot of good information, but it is still a bit vague about what is normative and what is typical.

There is an MCP “everything” server, which apparently supports garlic, onion, sesame and poppy seeds, and sea salt. Most “real” MCP servers support a subset of the features that the “everything” server has, but I figure if I can substantially talk to the “everything” server I should be able to talk to most of the other servers.

One difficulty I had was with the cl-json library. I found some JSON objects that didn't correctly round-trip through json decoding and encoding. I eventually customized the decoder to avoid these problems.

MCP is built atop the jsonrpc (JSON remote procedure call) protocol. It's pretty straightforward, the only difficulty is that the function calls and responses are multiplexed over a single stream. Each request has an ID that must be paired with the response. MCP servers are allowed to send unsolicited messages to the client, so the client must be prepared to handle messages with IDs that it didn't send.

To make this work, there are three threads associated with each MCP server. One is responsible for serial transmission of messages to the server, one is responsible for receiving messages from the server, and one is responsible for draining the stderr stream from the server. When a thread in Lisp wants to send a message, it registers itself as a recipient for the response, and then it puts the message on the send queue. The send thread pulls messages off the queue and sends them to the server. The receive thread gets messages from the server and attempts to match them with a pending recipient. If it finds one, it wakes up the pending recipient and delivers the message. If it doesn't find one, it starts a thread to run the unsolicited message handler. Unsolicited messages are often things like log messages from the MCP server or notifications of changes in resources. You have to be careful to get the details right or the whole thing will deadlock or leak resources.

The MCP client is expected to provide a couple of services to the MCP server. One example is a service that queries the user for input. Once the use has provided the input, the client sends it back to the MCP server. Another example is a way for an MCP server to request that the MCP client run a prompt through the LLM. Since the LLM can send calls out to other MCP servers, this can lead to a combinatorical explosion of interactions between MCP servers and clients.

The “sequential thinking” MCP server is a relatively simple, but very powerful MCP server that just keeps track of the steps in a long process. As each step completes, it issues the instructions to proceed to the next step. It is a state machine driver. Each state transition will involve propmts to the LLM or web searches, or the use of another tool. The results of each step are accumulated in the context — the history of requests. By the time you reach the final step, the context will have all the information necessary to complete that step. Thus the “sequential thinking” MCP server can orchestrate simple workflows. For example, if I want to test one of the other MCP servers, I can ask the system to “create a plan to test the XXX MCP server, refine the plan, and then execute the plan”. This will be turned into a series of steps which will be carried out by the “sequential thinking” MCP server.

There are a lot of crap MCP servers out there. You don't want to connect up too many of them at once or the LLM will get confused about exactly which server is supposed to be doing what. It is best to start with one or two MCP servers and see how they interact.

No comments: