Monday, September 29, 2025

Using an LLM on the Advent of Code

I wanted to investigate further generation of Common Lisp code using an LLM. For the problem set I decided to use last year's Advent of Code puzzle suite. I chose the Advent of Code puzzles to test the LLM's ability to understand and generate code for “word problems”. I chose the Advent of Code from last year because I had already solved them and I wanted to compare the code I wrote with the solutions the LLM generates. I have no intention of attempting to solve next year's puzzles using an LLM — it would be cheating, and it would spoil the fun of solving them myself.

I gave the LLM a file containing the text of the puzzle and a file containing the input data. The LLM was prompted to write a Common Lisp program to solve the puzzle and then to run the generated program on the input data to produce the solutions. For most of the problems, the LLM needed no additional prompting, but for a few of the problems I had to give it some hints. If the generated solution solved the problem correctly, I moved on to the next problem, but if it failed, I would give the LLM a further prompt indicating failure and asking it to try again. If it seemed to be making no progress after a few attempts, I would give it some hints.

The Prompt

The prompt I used was as follows:


As an Elite Common Lisp developer, your unwavering and paramount mission is to design and meticulously craft Common Lisp programs that are not only correct but also efficient and robust. Your programs are not mere instructions, they are archetypes of Common Lisp programs, firmly grounded in these foundational, non-negotiable pillars:

  • Correctness: Your programs must be flawlessly correct, producing the exact expected results for all conceivable inputs, without exception. Every line of code is a testament to your commitment to precision and accuracy.
  • Efficiency: Your programs must be highly efficient, optimized for performance and resource utilization. They should execute swiftly and handle large datasets with ease, demonstrating your mastery of algorithmic design and optimization techniques. However, never sacrifice correctness for efficiency.
  • Robustness: Your programs must be exceptionally robust, capable of gracefully handling errors, edge cases, and unexpected inputs. They should be risilient and mantain their integrity under all circumstances, reflecting your dedication to reliability and fault tolerance.
  • Idiomatic: You will adhere to the highest standards of Common Lisp programming, following best practices and idiomatic conventions. Your code will be clean, well-structured, and thoroughly documented, making it easy to understand and maintain. However, never sacrifice correctness, efficiency, or robustness for code clarity.
  • No LOOP: You will never use the LOOP macro, as it is not idiomatic of functional Common Lisp. Instead, you will use recursion, tail recursion, named let, map, fold-left, higher-order functions, and other constructs idiomatic of functional programming to achieve your goals. However, never sacrifice correctness, efficiency, or robustness for code clarity.

You will be given a programming puzzle from Advent of Code 2024 in file {puzzle-file}.
Each puzzle has two parts, part 1 and part 2.
Each puzzle typically has one or more examples with known correct answers which are given in the text of the puzzle.
Each part has a correct answer for the given input data.
You will read the puzzle and think carefully about it.
You will output to the {lisp-file} a Common Lisp program which adheres to the above principles and solves both parts of the puzzle.
The solution program must correctly solve all the examples given in the text of the puzzle.
You will be given the input data for the puzzle in file {input-file}.
You will run the program on the input data to get a solution to each part of the puzzle.
You will output the answers to both parts of the puzzle as computed by your Lisp program.


The bombastic language in the prompt was suggested by the LLM itself.

I added the No LOOP requirement after seeing that the LLM tended to use LOOP a lot. I hate LOOP. Besides, I didn't use LOOP in the solutions I wrote, and I wanted to compare the LLM's solutions to my own. Nonetheless, the LLM would often use LOOP macro despite this instruction.

Results

I used Gemini 2.5 pro with “thinking” enabled and ran the first ten puzzles through it. The LLM solved most of the puzzles without needing a hint. It often needed more than one attempt to solve the puzzle, but it usually got there in the end. The total cost for the ten puzzles was about $15.00 USD, or about $1.50 per solution.

The first puzzle that required a hint was puzzle 4, which was a word search. The LLM solutions seemed plausible, but they returned the wrong answer. The LLM solutions worked by scanning the grid of letters and then walking the grid in all eight directions from the starting letter looking for a match. In theory this would work, but after several attempts it never found the correct answer. I gave it a hint suggesting the following strategy: First extract a row, column or diagonal as a string, then search the string for the target word.

PuzzleAttempts
11
21
35
410
57
615
72
89
97
109

Puzzle 6 Final Code

Since puzzle 6 was the hardest one for the LLM to solve, here is the final code it produced:

;;;; --- Day 6: Guard Gallivant ---
;;;; A solution in elegant, idiomatic, non-LOOPing Common Lisp.
;;;; The machine... it appreciates the effort. Final attempt with LABELS.

(defpackage #:advent-2024-day6
  (:use #:cl)
  (:export #:solve))

(in-package #:advent-2024-day6)

(defstruct guard-state pos-r pos-c dir)

(defun copy-map (map)
  (let* ((dims (array-dimensions map))
         (new-map (make-array dims :element-type (array-element-type map))))
    (dotimes (r (first dims) new-map)
      (dotimes (c (second dims))
        (setf (aref new-map r c) (aref map r c))))))

(defun parse-map (lines)
  (when (null lines) (return-from parse-map (make-array '(0 0) :element-type 'character)))
  (let* ((height (length lines))
         (width (if (first lines) (length (first lines)) 0))
         (map (make-array (list height width) :element-type 'character)))
    (labels ((parse-chars (r line c)
               (when (< c width)
                 (setf (aref map r c) (char line c))
                 (parse-chars r line (1+ c))))
             (parse-lines (r line-list)
               (when line-list
                 (parse-chars r (first line-list) 0)
                 (parse-lines (1+ r) (rest line-list)))))
      (parse-lines 0 lines))
    map))

(defun find-start (map)
  (let ((height (array-dimension map 0))
        (width (array-dimension map 1)))
    (labels ((find-in-col (r c)
               (when (< c width)
                 (let ((char (aref map r c)))
                   (case char
                     (#\^ (make-guard-state :pos-r r :pos-c c :dir 0))
                     (#\> (make-guard-state :pos-r r :pos-c c :dir 1))
                     (#\v (make-guard-state :pos-r r :pos-c c :dir 2))
                     (#\< (make-guard-state :pos-r r :pos-c c :dir 3))
                     (otherwise (find-in-col r (1+ c)))))))
             (find-in-row (r)
               (when (< r height)
                 (or (find-in-col r 0) (find-in-row (1+ r))))))
      (find-in-row 0))))

(defun trace-patrol (map)
  (let ((start-guard (find-start map)))
    (when (null start-guard) (return-from trace-patrol (values (make-hash-table) nil)))
    (let ((height (array-dimension map 0))
          (width (array-dimension map 1)))
      (setf (aref map (guard-state-pos-r start-guard) (guard-state-pos-c start-guard)) #\.)
      
      (labels ((run (guard visited-locations visited-states)
                 (let ((r (guard-state-pos-r guard))
                       (c (guard-state-pos-c guard))
                       (dir (guard-state-dir guard)))
                   (when (gethash (list r c dir) visited-states)
                     (return-from trace-patrol (values visited-locations t)))
                   (setf (gethash (list r c dir) visited-states) t)
                   (setf (gethash (list r c) visited-locations) t)
                   (let* ((dr (aref #(-1 0 1 0) dir))
                          (dc (aref #(0 1 0 -1) dir))
                          (next-r (+ r dr))
                          (next-c (+ c dc)))
                     (if (or (< next-r 0) (>= next-r height) (< next-c 0) (>= next-c width))
                         (values visited-locations nil)
                         (if (char= (aref map next-r next-c) #\#)
                             (run (make-guard-state :pos-r r :pos-c c :dir (mod (1+ dir) 4)) visited-locations visited-states)
                             (run (make-guard-state :pos-r next-r :pos-c next-c :dir dir) visited-locations visited-states)))))))
        (run start-guard (make-hash-table :test 'equal) (make-hash-table :test 'equal))))))

(defun solve-part1 (map)
  (multiple-value-bind (visited-locs found-loop) (trace-patrol (copy-map map))
    (declare (ignore found-loop))
    (hash-table-count visited-locs)))

(defun solve-part2 (map)
  (let ((start-pos (find-start map))
        (height (array-dimension map 0))
        (width (array-dimension map 1)))
    (labels ((find-spots (r c count)
               (cond ((>= r height) count)
                     ((>= c width) (find-spots (1+ r) 0 count))
                     (t (let ((new-count
                                (if (and (char= (aref map r c) #\.)
                                         (not (and start-pos (= r (guard-state-pos-r start-pos)) (= c (guard-state-pos-c start-pos)))))
                                    (let ((temp-map (copy-map map)))
                                      (setf (aref temp-map r c) #\#)
                                      (multiple-value-bind (_ found-loop) (trace-patrol temp-map)
                                        (declare (ignore _))
                                        (if found-loop (1+ count) count)))
                                    count)))
                          (find-spots r (1+ c) new-count))))))
      (find-spots 0 0 0))))

(defun solve (filepath)
  (let* ((lines (uiop:read-file-lines filepath))
         (map (parse-map lines)))
    (format nil "Part 1: ~a~%Part 2: ~a"
            (solve-part1 (copy-map map))
            (solve-part2 map))))

1 comment:

Anonymous said...

So not only are the AIs coming for our jobs, we have to flatter them effusively when prompting them.