Going Automatic
I've decided to use this project as a testbed for automatic development, to explore the kinds of techniques described in recent results [1] [2], and to speed up development. I'll be making the vast majority of changes to the codebase via coding agents, exploring what controls are needed to stay hands-off yet maintain code quality.
Short AGENTS.md
I've played around with longer AGENTS.md files and not seen much value. Agents would ignore pieces of them, and after a while the extra length felt like a waste of context space. I'm keeping AGENTS.md short this time around, "linking" to where more information can be retrieved, in the hope that brevity leaves only content important enough to be followed.
Guidance Documentation
To help guide agent behavior, I'll be encoding principles, architecture, and other guidance into markdown files in the docs directory. To keep context targeted, I'm instituting two hops: AGENTS.md links to docs/index.md, which in turn links to the available docs.
# Documentation Index
* `docs/index.md`: This file — brief description of every document in `docs/`.
* `docs/bin/principles.md`: Rules for code that lives in `bin/`.
* `docs/lib/architecture.md`: Module strategy and layering for `lib/` — where to put new logic, new concepts, new files.
* `docs/lib/principles.md`: Rules for code that lives in `lib/`.
* `docs/test/principles.md`: Rules for code that lives in `test/`.
I'm leaning on the fact that LLMs are heavily trained to understand and navigate directory structures, so the agent should be able to decide which docs to load based on the code it's considering editing. As with AGENTS.md, I intend to keep these documents short and focused.
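The two-hop layout only works if docs/index.md really does list every document. As an illustration, here is a minimal Python sketch of a check for that. It is not part of the project: the function name `unlisted_docs` is hypothetical, and it assumes the index names each document as a backticked path, as in the listing above.

```python
import re
from pathlib import Path


def unlisted_docs(repo_root: Path) -> list[str]:
    """Return markdown files under docs/ that docs/index.md does not mention."""
    docs_dir = repo_root / "docs"
    index_text = (docs_dir / "index.md").read_text(encoding="utf-8")
    # Collect every backticked markdown path in the index,
    # e.g. `docs/lib/principles.md`. A bare `docs/` is not a match.
    listed = set(re.findall(r"`(docs/[^`]+\.md)`", index_text))
    # Every markdown file that actually exists, as a repo-relative POSIX path.
    present = {
        path.relative_to(repo_root).as_posix()
        for path in docs_dir.rglob("*.md")
    }
    return sorted(present - listed)
```

Calling `unlisted_docs(Path.cwd())` from the repository root returns an empty list when the index is complete. A non-empty result could fail the build in the same way a compiler warning does.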
"Linting"
The reports say that agents cope well with linting, and with any other tool that blocks progress and explains why. An agent hits the block, tries something that fixes it or works around it, and keeps trying until the lint passes. My initial look at linters didn't reveal a clear and easy winner. I'll dig deeper, but in the meantime I've turned on all compiler warnings and made them fail the build.
Not every warning is useful in every part of the application, so separate sets of warnings are turned off globally and for each of bin, test, and test-integration.
Testing
I've set up integration tests so that I have a way of checking the application's correctness that is itself easy to verify. Each test executes the program in a controlled directory and asserts on its output or on the state of the database. I plan to review these tests thoroughly, while reviewing the application code less rigorously.
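The shape of such a test can be sketched in Python. This is an illustration under stated assumptions, not the project's actual test code: the database is assumed to be SQLite, the file name `notes.db` and the `notes` table are invented, and the program under test is replaced by a small stand-in script so that the example runs on its own.

```python
import sqlite3
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in for the real program (hypothetical): creates a database in the
# current directory, stores one note, and prints a confirmation.
FAKE_PROGRAM = """
import sqlite3
conn = sqlite3.connect("notes.db")
conn.execute("CREATE TABLE notes (body TEXT)")
conn.execute("INSERT INTO notes VALUES ('hello')")
conn.commit()
conn.close()
print("added note")
"""


def run_in_sandbox(command: list[str], workdir: Path) -> subprocess.CompletedProcess:
    """Run the command with workdir as its current directory and capture its output."""
    return subprocess.run(
        command, cwd=workdir, capture_output=True, text=True, check=False
    )


def test_add_creates_note() -> None:
    # A fresh temporary directory is the "controlled directory" for this run.
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        result = run_in_sandbox([sys.executable, "-c", FAKE_PROGRAM], workdir)

        # Assert on the program's output...
        assert result.returncode == 0
        assert result.stdout.strip() == "added note"

        # ...and on the state it left behind in the database.
        conn = sqlite3.connect(workdir / "notes.db")
        try:
            rows = conn.execute("SELECT body FROM notes").fetchall()
        finally:
            conn.close()
        assert rows == [("hello",)]
```

Because the test only sees the process boundary (arguments in, output and files out), it stays valid however the application code behind it is restructured. That independence is what makes it reasonable to review the tests closely and the implementation lightly.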
The unit tests will stay in place, since I've already put in the effort to set them up. I'm not sure whether they will help or hurt development speed, code quality, or correctness. For now I'll leave them alone and see how things turn out.
New Functionality
Using this new automatic development approach, I've added top-level command-line argument parsing to the application, along with two subcommands. Working with different kinds of modeled "objects" from the command line makes me think of git's subcommands, so functionality will be separated into subcommands. The two subcommands are init, which initializes a database at the root of a git repository, and add, which adds a note to that database.
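A git-style layout with a top-level parser and one parser per subcommand can be sketched with Python's standard `argparse` module. This is purely illustrative and not taken from the project: the program name `notes`, the positional `text` argument, and the `dispatch` helper are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # "notes" is a placeholder program name, not the project's real one.
    parser = argparse.ArgumentParser(prog="notes")
    subcommands = parser.add_subparsers(dest="command", required=True)

    subcommands.add_parser(
        "init", help="initialize a database at the root of the git repository"
    )

    add = subcommands.add_parser("add", help="add a note to the database")
    # Hypothetical argument: the source does not say how the note is supplied.
    add.add_argument("text", help="body of the note")
    return parser


def dispatch(argv: list[str]) -> str:
    """Parse argv and report which subcommand would run (no real work is done)."""
    args = build_parser().parse_args(argv)
    if args.command == "init":
        return "init"
    return f"add: {args.text}"
```

With this sketch, `dispatch(["add", "hello"])` selects the add branch, while an empty argument list makes `argparse` print a usage message and exit. Each new kind of object or action gets its own subparser, which keeps the top level small as the command set grows.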