New Software Engineering Modes
Complementing Gas Town, a team of engineers at StrongDM has coined the term "Software Factories" (via) for a different coding-agent software engineering methodology. They get straight to the heart of it:
Code must not be written by humans
Code must not be reviewed by humans
Coding agents write code faster than a human can read and understand it. This has the potential to be very valuable: not only quickly producing working programs, but also multiplying the amount of work a single engineer can ship. To sustain that speed, the code cannot be reviewed. I think most people who've worked with coding agents have seen that they can tap into the speed, only to end up forced to slow down and stretch their code-review skills. What would need to change to make full use of the speed?
I think you'd need, at the very least, no code review of the application being built, and likely no unit tests (that is, no tests that assert on internal implementation). Instead you treat the application as a black box and build infrastructure that tests observable outcomes. Simon Willison also sees this: "It imitates aggressive testing by an external QA team—an expensive but highly effective way of ensuring quality in traditional software." This mode of engineering leaves the coding agent free to write, delete, and rearrange code as it sees fit, and eliminates the communication overhead of reviewing the implementation.
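To make the distinction concrete, here is a minimal sketch of a black-box test in Python. The stub server stands in for the agent-built application; in practice you would point the test at the real deployment. The `/health` route and the `check_health` helper are illustrative assumptions, not anything from the StrongDM post. The key property is that the test touches only the public interface (HTTP status and body), never internal modules, so the agent is free to restructure everything behind it.

```python
# Black-box test sketch: the application is reached only through its
# public interface (HTTP here), never its internal modules.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class StubApp(BaseHTTPRequestHandler):
    """Stands in for the agent-built application under test."""
    def do_GET(self):
        body = json.dumps({"status": "ok"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

# Port 0 asks the OS for any free port, so the test is self-contained.
server = HTTPServer(("127.0.0.1", 0), StubApp)
threading.Thread(target=server.serve_forever, daemon=True).start()
BASE_URL = f"http://127.0.0.1:{server.server_port}"

def check_health(base_url: str) -> dict:
    """Assert only on observable outcomes: status code and response body."""
    with urllib.request.urlopen(f"{base_url}/health") as resp:
        assert resp.status == 200
        return json.loads(resp.read())

result = check_health(BASE_URL)
server.shutdown()
```

A test like this survives any internal rewrite, which is exactly what the factory mode needs: the agent can delete and regenerate the implementation wholesale without invalidating the test suite.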
They don't really talk about it in the post, but I expect there's some review of the end-to-end tests and specifications between team members. There is still value in review between team members: humans simply make mistakes and forget things; we are better together.
Downsides: I don't think this mode of software engineering will take over the entire industry, because coding agents have real weaknesses. It's not clear to me that we will reach the model improvements necessary to do away with the traditional software engineering mode we mostly still employ. We're seeing gains from "thinking mode" and from reinforcement learning applied at the end of model training, but we appear to have mined out the improvements from simply making the models larger. Coding agents are still pretty bad at:
- Error handling. They often litter the codebase with checks for impossible states: null checks on values that can never be null, unnecessary unboxing of option and result types, catching and re-throwing exceptions; defensive programming at its worst. At the other extreme, they frequently fail to handle, or even consider, genuine error conditions.
- Performance. Coding agents routinely take the most direct path to a working solution. They don't tend to consider the performance implications of the code they generate unless instructed to, and the sheer volume of generated code is often complex enough to obscure potential performance improvements.
- Complexity. The jury is still out, for me, on whether coding agents can scale their intelligence to large, complex codebases that were themselves the product of rapid code generation. Agents still routinely duplicate concepts and reimplement logic in multiple ways rather than abstracting. The tests of agents handling complexity I have seen were run on large codebases already organized for understanding by human minds, not on ones haphazardly slapped together the way coding agents currently produce them.
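The error-handling point is easiest to see side by side. Below is an illustrative Python contrast (my own example, not from the post): the first function guards against states its caller already rules out, while the second states its contract and lets genuine errors surface.

```python
# Agent-style defensive programming: checks for impossible states.
def total_defensive(prices):
    if prices is None:            # caller never passes None
        return 0
    total = 0
    for p in prices:
        try:
            if p is not None:     # the list is known to hold numbers
                total += p
        except TypeError:         # swallowing the error hides real bugs
            pass
    return total

# Idiomatic: trust the invariant; a bad input fails loudly at the call site.
def total(prices: list[float]) -> float:
    return sum(prices)

assert total_defensive([1.0, 2.5]) == total([1.0, 2.5]) == 3.5
```

The defensive version is longer, slower to read, and, worse, its blanket `except` converts a real bug (a string in the list, say) into silently wrong output.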
All of these downsides tell me that we will still employ traditional software engineering (perhaps with coding agents used surgically) for foundational software: the applications and libraries where performance, security, and correctness are paramount. As coding agent performance improves and costs decrease, I think we'll see more teams embrace this software factory mode for applications and services that won't be severely impacted by the downsides above. The benefits of rapid feature development and avoidance of Brooks's law will be undeniable.