Wednesday, July 26, 2023

The Garden Path

Follow me along this garden path (based on true events).

We have a nifty program and we want it to be flexible, so it has a config file. We make up some sort of syntax that indicates key/value pairs. Maybe we’re hipsters and use YAML. Life is good.

But we find that we need to configure something dynamically, say based on the value of an environment variable. So we add some escape syntax to the config file to indicate that a value is a variable rather than a literal. But sometimes the string needs a little work done to it, so we add some string manipulation features to the escape syntax.
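To make this step concrete, here is a minimal sketch in Python of what such an escape syntax might look like. Everything in it is made up for illustration: the `${NAME|filter}` notation, the filter names, and the `expand` function are assumptions, not anything from a real config system.

```python
import re

# Hypothetical escape syntax: ${NAME} or ${NAME|filter}.
_ESCAPE = re.compile(r"\$\{(\w+)(?:\|(\w+))?\}")

# The "little work done to it": string manipulation bolted onto the escape.
_FILTERS = {"upper": str.upper, "lower": str.lower, "strip": str.strip}


def expand(value, env):
    """Replace every ${NAME} or ${NAME|filter} in value using env.

    Unset variables expand to the empty string; an unknown filter
    raises KeyError. Both choices are arbitrary assumptions.
    """
    def substitute(match):
        name, filter_name = match.group(1), match.group(2)
        text = env.get(name, "")
        if filter_name is not None:
            text = _FILTERS[filter_name](text)
        return text

    return _ESCAPE.sub(substitute, value)
```

Calling `expand("/srv/${APP_ENV|upper}/data", {"APP_ENV": "staging"})` gives `"/srv/STAGING/data"`. Note what the sketch already leaves undecided: how to write a literal `${`, whether filters chain, and what an unset variable should mean.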

And when we deploy the program, we find that we want to conditionalize part of the configuration based on the deployment, so we add a conditional syntax to our config language. But conditionals are predicated on boolean values, so we add booleans to our config syntax. Or maybe we make strings do double duty. Of course we need the basic boolean operators, too.
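A sketch of how the conditional step tends to look, again in Python and again hypothetical: the `if`/`then`/`else` keys, the list of "true" strings, and the `resolve` function are invented for illustration, and the config is assumed to be already parsed into nested dicts.

```python
# Strings doing double duty as booleans: somebody has to decide
# which spellings count as true. This list is an arbitrary assumption.
TRUE_STRINGS = {"true", "yes", "on", "1"}


def truthy(value):
    """Interpret a config value as a boolean."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in TRUE_STRINGS


def resolve(node, env):
    """Evaluate hypothetical {"if": VAR, "then": X, "else": Y} nodes.

    Any dict containing an "if" key is treated as a conditional,
    so a setting that is legitimately named "if" gets misread.
    """
    if isinstance(node, dict) and "if" in node:
        branch = "then" if truthy(env.get(node["if"], "")) else "else"
        return resolve(node.get(branch), env)
    if isinstance(node, dict):
        return {key: resolve(val, env) for key, val in node.items()}
    return node
```

With `{"db": {"if": "PRODUCTION", "then": "prod-db", "else": "dev-db"}}` and an environment of `{"PRODUCTION": "yes"}`, `resolve` returns `{"db": "prod-db"}`. Twenty lines in, we have an evaluator, a truthiness rule, and a reserved word.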

But there’s a lot of duplication across our configurations, so we add the ability to indirectly refer to other config files. That helps to some extent, but there’s a lot of stuff that is almost duplicated, except for a little variation. So we add a way to make a configuration template. Templating needs variables and quoting, so we invent a syntax for those as well.

We’re building a computer language by accident, and without a clear plan it is going to go poorly. Are there data types (aside from strings)? Is there a coherent type system? Are the variables lexically scoped? Is it call-by-name or call-by-value? Is it recursive? Does it have first class (or even second class) procedures? Did we get nested escaping right? How about quoted nested escaping? And good grief our config language is in YAML!

If we had had some forethought, we would have realized that we were designing a language and we would have put the effort into making it a good one. If we were lazy, we'd just pick an existing good language. Like Lisp.

3 comments:

Big 40wt Svetlyak said...

That is why I decided to get rid of YAML and just use Lisp. Even recorded a video about my approach: https://www.youtube.com/watch?v=j2d5hYGVI7M

Anonymous said...

There's a relevant old quip, I think by Steele, to the effect that the corollary of "code is data" is that "data is code". The steps a program takes to ingest its config file(s) are, in effect, an interpreter for a language whose purpose is frobbing the program's state. (I'd consider argv handling to be another sort of interpreter, too.) As is generally true about language design, most config file languages are mediocre, because language design is hard! (There can be good reasons to design config file languages anyhow, e.g., when a program has complicated constraints in its configuration space. But ISTM that the tradition of inventing ad hoc configuration languages for programs stems mostly from the absence of eval in historical systems languages.)

owen said...

Yaml is where all your problems started but you have only yourself to blame for the scope creep.