Saturday, August 23, 2025

CL-JSON Ambiguity

When you are using a library, you will usually get best results if you avoid customizations, flags, and other non-default settings. Just use it out of the box with the vanilla settings. This isn't always true, but it is a good place to start. The default settings are likely to have been thoroughly tested, and you are more likely to encounter fellow travellers who are using the same settings. But if you have a good reason to change a setting, go ahead and do so.

I use the cl-json library for handling JSON. I encountered a bug in it. When using the default settings, certain objects won't round-trip through the decoder and encoder.

By default JSON arrays are decoded into lists and JSON objects are decoded into association lists. This is very convenient for us Lisp hackers, but there is an ambiguity inherent in this approach. If the value of a key in a JSON object is itself an array, then the decoded object, an association list, will have an entry that is a CONS of a key and a list. But a CONS of an element and a list is a larger list, so we cannot tell whether a list is an association list entry or simply a nested list. In other words, cl-json will decode these two different JSON objects into the same Lisp object:

{"a": [1, 2, 3]}        [["a", 1, 2, 3]]
(("a" . (1 2 3)))   =   (("a" 1 2 3))

When encoding (("a" 1 2 3)), the encoder cannot tell whether to treat it as an association list and emit an object, or to treat it as a nested list and emit a nested array.
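You can see the collision without involving cl-json at all. Here is a minimal sketch in plain Common Lisp; the variable names are just for illustration. Dotted-pair notation and list notation are two spellings of the same cons structure, so the reader produces identical objects:

```lisp
;; Two ways of writing the same cons structure.
(defparameter *as-alist-entry* '(("a" . (1 2 3))))
(defparameter *as-nested-list* '(("a" 1 2 3)))

;; Structurally identical, so nothing downstream can tell them apart.
(equal *as-alist-entry* *as-nested-list*)             ; true

;; Both readings of the same object are valid:
(cdr (assoc "a" *as-nested-list* :test #'string=))    ; alist view: (1 2 3)
(first *as-nested-list*)                              ; list view: ("a" 1 2 3)
```

Whatever an encoder does with this object, it is guessing.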

Solving this problem is pretty easy, though. You customize the decoder to use vectors for JSON arrays and hash tables for JSON objects. Now there is no ambiguity because the encoder never encounters a list.
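To illustrate why the ambiguity disappears, here is a toy encoder that dispatches purely on type. This is a sketch of the idea, not cl-json's encoder: the name toy-encode is made up, and it handles only integers, strings, vectors, and hash tables, with no string escaping.

```lisp
;; Hypothetical toy encoder, for illustration only (not part of cl-json).
(defun toy-encode (value)
  "Return a JSON string for VALUE: an integer, string, vector, or hash table."
  (etypecase value
    (integer (format nil "~D" value))
    ;; STRING must come before VECTOR, because strings are vectors too.
    (string (format nil "\"~A\"" value))
    ;; A hash table can only mean a JSON object.
    (hash-table
     (let ((members '()))
       (maphash (lambda (key val)
                  (push (format nil "~A:~A" (toy-encode key) (toy-encode val))
                        members))
                value)
       (format nil "{~{~A~^,~}}" (nreverse members))))
    ;; Any other vector can only mean a JSON array.
    (vector
     (format nil "[~{~A~^,~}]" (map 'list #'toy-encode value)))))

;; The two JSON values from the example now have distinct representations.
(let ((object (make-hash-table :test #'equal)))
  (setf (gethash "a" object) (vector 1 2 3))
  (toy-encode object))                        ; the object with an array value

(toy-encode (vector (vector "a" 1 2 3)))      ; the nested array
```

There is no guessing step anywhere in the dispatch, which is the whole point.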

Unfortunately, the decoded data is a bit clumsier to work with. List operations do not work on vectors, but sequence operations still do.
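A few examples of what still works, using only standard Common Lisp (the variable name is illustrative). Functions like CAR and MAPCAR expect lists and will not accept a vector, but the generic sequence functions do:

```lisp
(defparameter *numbers* (vector 1 2 3))

;; Sequence functions work on vectors and lists alike.
(length *numbers*)            ; 3
(elt *numbers* 0)             ; 1
(reduce #'+ *numbers*)        ; 6
(find 2 *numbers*)            ; 2
(map 'list #'1+ *numbers*)    ; (2 3 4)

;; If you really need a list, convert explicitly.
(coerce *numbers* 'list)      ; (1 2 3)
```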
