More Software Engineering Research
OpenAI has put out yet another software engineering report on coding agent use, closer to StrongDM's Software Factory than to Gas Town.
Ryan Lopopolo for OpenAI on 2026-02-11:
Over the past five months, our team has been running an experiment: building and shipping an internal beta of a software product with 0 lines of manually-written code.
The product has internal daily users and external alpha testers. It ships, deploys, breaks, and gets fixed. What’s different is that every line of code—application logic, tests, CI configuration, documentation, observability, and internal tooling—has been written by Codex.
Humans steer. Agents execute.
OpenAI provides a number of interesting details here that, I think, complement the practices described by StrongDM. They started out reviewing Codex's commits, and that review load shrank drastically over time as every correction was folded into a set of guidance documents that agents automatically pick up. It sounds like feature and guidance documents still got human-to-human review. The picture they paint makes a lot of sense: encode practical software engineering standards into tooling and guidance documents, then routinely "garbage collect" using agents prompted specifically to review and clean up. One detail that stood out to me: they found "rolling forward", leaning on how quickly coding agents can generate a fix, to be faster and less disruptive than rolling back buggy changes.
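Codex picks up repository-level guidance from AGENTS.md files, which is presumably where corrections like these end up. The report doesn't publish OpenAI's actual guidance, so the following is purely an illustrative sketch of what that kind of correction-capturing document might look like; every rule in it is my own invention:

```
# AGENTS.md (illustrative sketch, not from the report)

## Corrections from past reviews
- Every bug fix ships with a regression test; fixes without tests get rejected in review.
- Prefer rolling forward: open a new change with the fix rather than reverting the old one.
- New endpoints need structured logging and at least one metric before they merge.

## Periodic "garbage collection" pass
When asked to clean up, look for dead code, stale feature flags, and docs that
no longer match behavior, and propose a separate change for each.
```

The appeal of this setup is that a one-time review comment becomes a standing instruction, so the same correction doesn't have to be made by a human twice.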